CN107068154A - Method and system for identity verification based on voiceprint recognition - Google Patents
Method and system for identity verification based on voiceprint recognition
- Publication number: CN107068154A (application number CN201710147695.XA)
- Authority: CN (China)
- Prior art keywords: voiceprint, speech data, identity verification, voiceprint feature, training
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- H04L63/0861 — Network architectures or network communication protocols for network security: authentication of entities using biometrical features, e.g. fingerprint, retina-scan
- G10L15/26 — Speech recognition: speech to text systems
- G10L17/02 — Speaker identification or verification techniques: preprocessing operations, e.g. segment selection; pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; feature selection or extraction
- G10L17/04 — Speaker identification or verification techniques: training, enrolment or model building
- G10L17/06 — Speaker identification or verification techniques: decision making techniques; pattern matching strategies
- G10L25/18 — Speech or voice analysis techniques: the extracted parameters being spectral information of each sub-band
- G10L25/24 — Speech or voice analysis techniques: the extracted parameters being the cepstrum
Abstract
The present invention relates to a method and system for identity verification based on voiceprint recognition. The method includes: after receiving the speech data of a user requesting identity verification, extracting the voiceprint features of the speech data and building a corresponding voiceprint feature vector; feeding the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint identification vector of the speech data; and computing the spatial distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector, verifying the user's identity based on that distance, and generating the verification result. The present invention improves both the accuracy and the efficiency of user identity verification.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method and system for identity verification based on voiceprint recognition.
Background
At present, large financial companies operate across multiple lines of business such as insurance, banking, and investment. Each line of business needs to communicate with the same customers, and the communication takes many forms (for example, by telephone or face to face). Before such communication, verifying the customer's identity is an essential part of keeping the business secure. To meet real-time business demands, financial companies generally verify customer identity by manual analysis. Because the customer base is huge, manual discrimination of customer identity is neither accurate nor efficient.
Summary of the invention
The object of the present invention is to provide a method and system for identity verification based on voiceprint recognition, aiming to improve the accuracy and efficiency of user identity verification.
To achieve the above object, the present invention provides a method of identity verification based on voiceprint recognition, which includes:
S1, after receiving the speech data of a user requesting identity verification, extracting the voiceprint features of the speech data and building a corresponding voiceprint feature vector;
S2, feeding the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint identification vector of the speech data;
S3, computing the spatial distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector, verifying the user's identity based on that distance, and generating the verification result.
Preferably, step S1 includes:
S11, applying pre-emphasis, framing, and windowing to the speech data;
S12, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum;
S13, passing the spectrum through a Mel filter bank to obtain the Mel spectrum;
S14, performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the corresponding voiceprint feature vector from the MFCCs.
Preferably, step S3 includes:
S31, computing the cosine distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector: cos θ = (A · B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector;
S32, if the cosine distance is less than or equal to a preset distance threshold, generating a message that verification passed;
S33, if the cosine distance is greater than the preset distance threshold, generating a message that verification failed.
Preferably, the background channel model is a Gaussian mixture model, and before step S1 the method includes:
obtaining a preset number of speech data samples, extracting the voiceprint features of each speech data sample, and building each sample's voiceprint feature vector from its voiceprint features;
dividing the voiceprint feature vectors of the samples into a training set with a first ratio and a validation set with a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training a Gaussian mixture model with the voiceprint feature vectors in the training set and, after training completes, checking the accuracy of the trained Gaussian mixture model on the validation set;
if the accuracy is greater than a preset threshold, ending the model training and using the trained Gaussian mixture model as the background channel model of step S2, or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
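The split-train-validate-retrain procedure above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the diagonal-covariance EM routine, the 0.75/0.25 split, the two-component mixture, and the use of held-out average log-likelihood in place of the patent's unspecified "accuracy" metric are all assumptions.

```python
import numpy as np

def fit_diag_gmm(X, K, iters=50, seed=0):
    """Minimal EM for a diagonal-covariance Gaussian mixture model."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, K, replace=False)]          # means, init from data
    var = np.tile(X.var(axis=0) + 1e-6, (K, 1))      # per-component variances
    w = np.full(K, 1.0 / K)                          # mixture weights
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain for stability
        logp = (np.log(w)
                - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
                - 0.5 * (((X[:, None, :] - mu) ** 2) / var).sum(axis=2))
        logp -= logp.max(axis=1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, variances
        Nk = r.sum(axis=0) + 1e-10
        w = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        var = (r.T @ (X ** 2)) / Nk[:, None] - mu ** 2 + 1e-6
    return w, mu, var

def avg_loglike(X, w, mu, var):
    """Average per-frame log-likelihood, used here as the validation score."""
    logp = (np.log(w)
            - 0.5 * np.sum(np.log(2 * np.pi * var), axis=1)
            - 0.5 * (((X[:, None, :] - mu) ** 2) / var).sum(axis=2))
    m = logp.max(axis=1, keepdims=True)
    return float(np.mean(m.squeeze() + np.log(np.exp(logp - m).sum(axis=1))))

rng = np.random.default_rng(1)
feats = np.vstack([rng.normal(0, 1, (300, 4)), rng.normal(3, 1, (300, 4))])
rng.shuffle(feats)
split = int(0.75 * len(feats))          # first ratio 0.75, second ratio 0.25
train, valid = feats[:split], feats[split:]
w, mu, var = fit_diag_gmm(train, K=2)
score = avg_loglike(valid, w, mu, var)  # proxy for the patent's accuracy check
```

In the patent's loop, a score below the preset threshold would trigger collecting more speech samples and rerunning the fit.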
Preferably, step S3 is replaced with: computing the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, taking the minimum spatial distance, verifying the user's identity based on that minimum distance, and generating the verification result.
To achieve the above object, the present invention also provides a system of identity verification based on voiceprint recognition, which includes:
a first acquisition module, for extracting the voiceprint features of the speech data after receiving the speech data of a user requesting identity verification, and building a corresponding voiceprint feature vector based on those features;
a building module, for feeding the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint identification vector of the speech data;
a first verification module, for computing the spatial distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector, verifying the user's identity based on that distance, and generating the verification result.
Preferably, the first acquisition module is specifically configured to apply pre-emphasis, framing, and windowing to the speech data; apply a Fourier transform to each windowed frame to obtain the corresponding spectrum; pass the spectrum through a Mel filter bank to obtain the Mel spectrum; and perform cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), composing the corresponding voiceprint feature vector from the MFCCs.
Preferably, the first verification module is specifically configured to compute the cosine distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector: cos θ = (A · B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector; to generate a message that verification passed if the cosine distance is less than or equal to a preset distance threshold; and to generate a message that verification failed if the cosine distance is greater than the preset distance threshold.
Preferably, the system of identity verification based on voiceprint recognition also includes:
a second acquisition module, for obtaining a preset number of speech data samples, extracting the voiceprint features of each speech data sample, and building each sample's voiceprint feature vector from its voiceprint features;
a division module, for dividing the voiceprint feature vectors of the samples into a training set with a first ratio and a validation set with a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, for training a Gaussian mixture model with the voiceprint feature vectors in the training set and, after training completes, checking the accuracy of the trained Gaussian mixture model on the validation set;
a processing module, for ending the model training when the accuracy is greater than a preset threshold and using the trained Gaussian mixture model as the background channel model, or, when the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
Preferably, the first verification module is replaced with a second verification module, for computing the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, taking the minimum spatial distance, verifying the user's identity based on that minimum distance, and generating the verification result.
The beneficial effects of the invention are as follows: the pre-trained background channel model of the present invention is obtained by mining and comparative training over a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, the model can accurately characterize the background voiceprint features present when the user speaks, remove those features during recognition, and extract the intrinsic features of the user's voice. This significantly improves the accuracy of user identity verification and also improves its efficiency.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a preferred embodiment of the method of identity verification based on voiceprint recognition of the present invention;
Fig. 2 is a detailed flowchart of step S1 shown in Fig. 1;
Fig. 3 is a detailed flowchart of step S3 shown in Fig. 1;
Fig. 4 is a schematic diagram of the operating environment of a preferred embodiment of the system of identity verification based on voiceprint recognition of the present invention;
Fig. 5 is a schematic structural diagram of a preferred embodiment of the system of identity verification based on voiceprint recognition of the present invention.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, Fig. 1 is a schematic flowchart of one embodiment of the method of identity verification based on voiceprint recognition of the present invention. The method can be performed by a system of identity verification based on voiceprint recognition; the system can be realized by software and/or hardware and can be integrated in a server. The method comprises the following steps:
Step S1, after receiving the speech data of a user requesting identity verification, extracting the voiceprint features of the speech data and building a corresponding voiceprint feature vector.
In the present embodiment, the speech data is collected by a voice capture device (for example, a microphone), which sends the collected speech data to the system of identity verification based on voiceprint recognition.
When collecting speech data, interference from ambient noise and from the capture device itself should be prevented as far as possible. The capture device should keep a suitable distance from the user, a low-distortion capture device should be used where possible, mains power is preferred for the power supply, and the current should be kept stable; a dedicated sensor should be used for telephone recordings. Before the voiceprint features are extracted, the speech data can be de-noised to further reduce interference. So that the voiceprint features can be extracted, the collected speech data must be of a preset data length, or longer than the preset data length.
Voiceprint features come in many types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints. The voiceprint features of the present embodiment are preferably the Mel-frequency cepstral coefficients (MFCC) of the speech data. When building the corresponding voiceprint feature vector, the voiceprint features of the speech data are assembled into a feature data matrix, and this matrix is the voiceprint feature vector of the speech data.
Step S2, feeding the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint identification vector of the speech data.
The voiceprint feature vector is input into the background channel model generated by pre-training. Preferably, the background channel model is a Gaussian mixture model; the model operates on the voiceprint feature vector to produce the corresponding current voiceprint identification vector (i.e., the i-vector).
Specifically, the calculation proceeds as follows:
1) Selecting Gaussian components: first, using the parameters of the universal background channel model, the log-likelihood of each frame of data under each Gaussian component is computed; each column of the log-likelihood matrix is sorted in parallel and the top N Gaussian components are chosen, finally yielding, per frame, a matrix of values under the mixed Gaussian model:
Loglike = E(X) · D(X)⁻¹ · Xᵀ − 0.5 · D(X)⁻¹ · (X.²)ᵀ,
where Loglike is the log-likelihood matrix, E(X) is the mean matrix trained by the universal background channel model, D(X) is the covariance matrix, X is the feature data matrix, and X.² is the element-wise square of X.
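In NumPy-like notation, the per-frame log-likelihood matrix and the top-N component selection can be sketched as below. The data is a random stand-in, and the diagonal-covariance form (and the omission of the per-Gaussian constant terms, which do not change the ranking) are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, K, N = 100, 13, 8, 3         # frames, feature dim, Gaussians, top-N
X = rng.normal(size=(T, D))        # feature data matrix, one row per frame
E = rng.normal(size=(K, D))        # UBM mean matrix E(X)
Dv = rng.uniform(0.5, 2.0, (K, D)) # diagonal covariance matrix D(X)

# Loglike = E(X)·D(X)^-1·X^T - 0.5·D(X)^-1·(X.^2)^T, transposed to (frames, Gaussians)
loglike = ((E / Dv) @ X.T - 0.5 * (1.0 / Dv) @ (X ** 2).T).T

# sort each frame's scores and keep the indices of the top-N Gaussian components
top = np.argsort(-loglike, axis=1)[:, :N]
```

Keeping only the top-N components per frame is what lets the later posterior and statistics computations skip the remaining Gaussians.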
2) Computing posterior probabilities: for each frame the data X is multiplied as X · Xᵀ, giving a symmetric matrix that can be reduced to a lower triangular matrix whose elements are arranged in order as a single row, so that the computation operates on a vector whose dimension is the number of frames times the number of elements of the lower triangular matrix; the vectors of all frames are combined into a new data matrix. At the same time, the covariance matrices used for the probabilities in the universal background model are each reduced to a lower triangular matrix and assembled into a matrix of the same form as the new data matrix. Using the mean matrix and covariance matrix of the universal background channel model, the log-likelihood of each frame under its selected Gaussian components is computed, a Softmax regression is applied, and finally the values are normalized, giving each frame's posterior probability distribution under the mixed Gaussian model; the probability distribution vectors of all frames form the probability matrix.
3) Extracting the current voiceprint identification vector: first the first-order and second-order coefficients are computed. The first-order coefficients can be obtained by summing the probability matrix over its rows:
Gammaᵢ = Σⱼ loglikeⱼᵢ,
where Gammaᵢ is the i-th element of the first-order coefficient vector and loglikeⱼᵢ is the element in row j, column i of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
X = Loglikeᵀ · feats,
where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first- and second-order coefficients are obtained, the linear and quadratic terms are computed in parallel, and the current voiceprint identification vector is calculated from them.
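These two accumulations can be sketched directly (in the i-vector literature they are usually called the zeroth- and first-order Baum-Welch statistics; the patent numbers them first and second). The posterior matrix below is a random stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
T, D, K = 50, 13, 4                        # frames, feature dim, Gaussians
feats = rng.normal(size=(T, D))            # feature data matrix
raw = rng.normal(size=(T, K))
loglike = np.exp(raw)                      # stand-in posterior (probability) matrix
loglike /= loglike.sum(axis=1, keepdims=True)   # each frame's posteriors sum to 1

# first-order coefficients: Gamma_i = sum_j loglike[j, i]  (column sums)
gamma = loglike.sum(axis=0)

# second-order coefficients: X = Loglike^T · feats
Xstat = loglike.T @ feats                  # shape (K, D)
```

Because each row of the posterior matrix sums to 1, the elements of gamma sum to the number of frames, which is a handy sanity check on the accumulation.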
Preferably, the background channel model is a Gaussian mixture model, and before step S1 the method includes:
obtaining a preset number of speech data samples, extracting the voiceprint features of each speech data sample, and building each sample's voiceprint feature vector from its voiceprint features;
dividing the voiceprint feature vectors of the samples into a training set with a first ratio and a validation set with a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training a Gaussian mixture model with the voiceprint feature vectors in the training set and, after training completes, checking the accuracy of the trained Gaussian mixture model on the validation set;
if the accuracy is greater than a preset threshold, ending the model training and using the trained Gaussian mixture model as the background channel model of step S2, or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood of each extracted D-dimensional voiceprint feature under K Gaussian components can be expressed as:
P(x) = Σₖ₌₁..ᴷ wₖ · p(x|k),
where P(x) is the probability that a speech data sample is generated by the Gaussian mixture model, wₖ is the weight of the k-th Gaussian component, p(x|k) is the probability that the sample is generated by the k-th Gaussian component, and K is the number of Gaussian components.
The parameters of the whole Gaussian mixture model can be expressed as {wᵢ, μᵢ, Σᵢ}, where wᵢ is the weight, μᵢ the mean, and Σᵢ the covariance of the i-th Gaussian component. The Gaussian mixture model is trained with the unsupervised EM algorithm. After training completes, the weight vector, constant vector, N covariance matrices, and mean-times-covariance matrices of the Gaussian mixture model are obtained; together these constitute a trained Gaussian mixture model.
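The mixture likelihood P(x) = Σₖ wₖ p(x|k) can be evaluated directly. The sketch below assumes diagonal covariances and uses toy two-component parameters, not values from the patent.

```python
import numpy as np

def gmm_density(x, w, mu, var):
    """P(x) = sum_k w_k * N(x; mu_k, diag(var_k)) for one sample x."""
    d = len(x)
    # normalization constant of each diagonal Gaussian component
    norm = (2 * np.pi) ** (-d / 2) * np.prod(var, axis=1) ** -0.5
    # per-component densities p(x|k)
    px_k = norm * np.exp(-0.5 * (((x - mu) ** 2) / var).sum(axis=1))
    return float(np.dot(w, px_k))          # weighted sum over components

w = np.array([0.3, 0.7])                   # mixture weights w_k, summing to 1
mu = np.array([[0.0, 0.0], [2.0, 2.0]])    # component means mu_k
var = np.ones((2, 2))                      # diagonal covariances
p = gmm_density(np.array([0.0, 0.0]), w, mu, var)
```

In EM training, exactly this quantity (in log form, summed over all frames) is the objective that the E- and M-steps monotonically increase.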
Step S3, computing the spatial distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector, verifying the user's identity based on that distance, and generating the verification result.
There are many distances between vectors, including the cosine distance and the Euclidean distance. Preferably, the spatial distance of the present embodiment is the cosine distance, which uses the cosine of the angle between two vectors in vector space as a measure of the difference between two individuals.
The standard voiceprint identification vector is a voiceprint identification vector obtained and stored in advance; it is stored together with the identity information of its corresponding user and can accurately represent that user's identity. Before the spatial distance is computed, the stored voiceprint identification vector is retrieved according to the identity information supplied by the user.
When the computed spatial distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
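A minimal sketch of the distance-threshold decision. Since the claims accept when the distance is at or below the threshold, this sketch treats 1 − cosine similarity as the "cosine distance" so that smaller means more alike; that convention and the threshold value 0.4 are assumptions, not values fixed by the patent.

```python
import numpy as np

def cos_distance(a, b):
    # 1 - cosine similarity: 0 for identical directions, larger for dissimilar
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(current, standard, threshold=0.4):
    """Pass verification when the distance does not exceed the threshold."""
    return cos_distance(current, standard) <= threshold

enrolled = np.array([0.9, 0.1, 0.4])   # stored standard identification vector
same = enrolled + 0.05                 # same speaker, slight variation
other = np.array([-0.5, 0.8, 0.1])     # different speaker
ok = verify(same, enrolled)
bad = verify(other, enrolled)
```

In practice the threshold would be tuned on held-out genuine/impostor pairs to trade off false accepts against false rejects.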
Compared with the prior art, the background channel model pre-trained in the present embodiment is obtained by mining and comparative training over a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, the model accurately characterizes the background voiceprint features present when the user speaks, removes those features during recognition, and extracts the intrinsic features of the user's voice, significantly improving the accuracy of user identity verification as well as its efficiency. In addition, the present embodiment makes full use of the voiceprint features related to the vocal tract, and these features place no restriction on the spoken text, so there is considerable flexibility when performing recognition and verification against text.
In a preferred embodiment, as shown in Fig. 2, on the basis of the embodiment of Fig. 1 above, step S1 includes:
Step S11, applying pre-emphasis, framing, and windowing to the speech data. In the present embodiment, after the speech data of the user requesting identity verification is received, the speech data is processed. Pre-emphasis is in effect high-pass filtering: it attenuates low-frequency content so that the high-frequency characteristics of the speech data stand out. Specifically, the transfer function of the high-pass filter is H(z) = 1 − αz⁻¹, where z is the speech signal variable and α is a constant coefficient, preferably with value 0.97. Because a speech signal is stationary only over short periods, a segment of speech is divided into N short-time signals (i.e., N frames); to avoid losing the continuity of the sound, adjacent frames share an overlap region, generally 1/2 of the frame length. After framing, each frame is processed as a stationary signal; however, because of the Gibbs phenomenon, the start and end of each frame are discontinuous, and after framing this deviates further from the original speech. The speech data must therefore also be windowed.
Step S12, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum.
Step S13, passing the spectrum through a Mel filter bank to obtain the Mel spectrum.
Step S14, performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the corresponding voiceprint feature vector from the MFCCs. Here cepstral analysis means, for example, taking the logarithm and applying an inverse transform; the inverse transform is generally realized by a DCT (discrete cosine transform), and the 2nd through 13th coefficients after the DCT are taken as the MFCC coefficients. The MFCCs are the voiceprint features of one frame of speech data; the per-frame MFCCs are assembled into a feature data matrix, and this matrix is the voiceprint feature vector of the speech data.
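Steps S11-S14 can be sketched end to end as below. The sample rate, frame length, Mel filter count, and Hamming window are illustrative assumptions; the patent fixes only α = 0.97, the 1/2-frame overlap, and keeping the 2nd-13th DCT coefficients.

```python
import numpy as np

def mfcc(signal, sr=8000, frame_len=256, alpha=0.97, n_mels=26, n_ceps=12):
    """Sketch of S11-S14: pre-emphasis, framing with 1/2 overlap, Hamming
    window, FFT spectrum, Mel filter bank, log, DCT, coefficients 2..13."""
    # S11: pre-emphasis H(z) = 1 - alpha*z^-1, then frame with 50% overlap
    emph = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    hop = frame_len // 2
    n_frames = 1 + (len(emph) - frame_len) // hop
    idx = np.arange(frame_len) + hop * np.arange(n_frames)[:, None]
    frames = emph[idx] * np.hamming(frame_len)
    # S12: magnitude spectrum of each windowed frame
    spec = np.abs(np.fft.rfft(frames, axis=1))
    # S13: triangular Mel filter bank applied to the spectrum
    def hz2mel(f): return 2595 * np.log10(1 + f / 700)
    def mel2hz(m): return 700 * (10 ** (m / 2595) - 1)
    mel_pts = mel2hz(np.linspace(hz2mel(0), hz2mel(sr / 2), n_mels + 2))
    bins = np.floor((frame_len + 1) * mel_pts / sr).astype(int)
    fb = np.zeros((n_mels, spec.shape[1]))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    mel_spec = spec @ fb.T
    # S14: cepstral analysis -- log, then DCT-II; keep coefficients 2..13
    log_mel = np.log(mel_spec + 1e-10)
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n + 0.5)[None, :] * np.arange(n_mels)[:, None])
    ceps = log_mel @ dct.T
    return ceps[:, 1:1 + n_ceps]   # 2nd..13th coefficients as the MFCC features

sig = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # 1 s toy tone
feats = mfcc(sig)                                       # one row per frame
```

The returned matrix of per-frame MFCC rows plays the role of the feature data matrix that the patent calls the voiceprint feature vector.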
In a preferred embodiment, as shown in Fig. 3, on the basis of the embodiment of Fig. 1 above, step S3 includes:
Step S31, computing the cosine distance between the current voiceprint identification vector and the user's stored standard voiceprint identification vector: cos θ = (A · B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector;
Step S32, if the cosine distance is less than or equal to a preset distance threshold, generating a message that verification passed;
Step S33, if the cosine distance is greater than the preset distance threshold, generating a message that verification failed.
In a preferred embodiment, on the basis of the embodiment of Fig. 1 above, step S3 is replaced with: calculating the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, obtaining the minimum spatial distance, performing identity verification on the user based on the minimum spatial distance, and generating a verification result.
This embodiment differs from the embodiment of Fig. 1 in that the standard voiceprint identification vectors are stored without the users' identification information. When a user's identity is verified, the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector is calculated and the minimum spatial distance is obtained; if the minimum spatial distance is less than a preset distance threshold (which may be the same as or different from the distance threshold of the embodiment above), the verification passes; otherwise, it fails.
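This 1:N variant can be sketched as follows; Euclidean distance is used here as the "spatial distance" (an assumption — the source also allows cosine distance), and the threshold is a placeholder:

```python
import numpy as np

def min_distance_verify(current_ivec, enrolled_ivecs, threshold=0.4):
    """1:N check: distance to every enrolled i-vector, decide on the minimum.
    No user identification information is attached to the enrolled vectors."""
    dists = np.linalg.norm(enrolled_ivecs - current_ivec, axis=1)
    return float(dists.min()) < threshold

enrolled = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])  # no user IDs stored
probe = np.array([0.68, 0.72])
print(min_distance_verify(probe, enrolled))  # True: close to [0.7, 0.7]
```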
Referring to Fig. 4, Fig. 4 is a schematic diagram of the operating environment of a preferred embodiment of the voiceprint-recognition-based identity verification system 10 of the present invention.
In this embodiment, the voiceprint-recognition-based identity verification system 10 is installed and runs in an electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a server. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13. Fig. 4 shows only the electronic device 1 with components 11-13; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
The memory 11 may, in some embodiments, be an internal storage unit of the electronic device 1, such as a hard disk or internal memory of the electronic device 1. In other embodiments, the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 is used to store the application software installed on the electronic device 1 and all kinds of data, such as the program code of the voiceprint-recognition-based identity verification system 10. The memory 11 may also be used to temporarily store data that has been output or is to be output.
The processor 12 may, in some embodiments, be a central processing unit (CPU), a microprocessor, or another data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the voiceprint-recognition-based identity verification system 10.
The display 13 may, in some embodiments, be an LED display, a liquid crystal display, a touch liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like. The display 13 is used to display the information processed in the electronic device 1 and to show a visual user interface, such as a voiceprint recognition interface. The components 11-13 of the electronic device 1 communicate with one another over a system bus.
Referring to Fig. 5, Fig. 5 is a functional block diagram of a preferred embodiment of the voiceprint-recognition-based identity verification system 10 of the present invention. In this embodiment, the voiceprint-recognition-based identity verification system 10 may be divided into one or more modules; the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the present invention. For example, in Fig. 5, the voiceprint-recognition-based identity verification system 10 may be divided into a first acquisition module 101, a build module 102, and a first authentication module 103. A module referred to in the present invention is a series of computer program instruction segments capable of completing a specific function, and is better suited than a program to describing the execution process of the voiceprint-recognition-based identity verification system 10 in the electronic device 1, wherein:
The first acquisition module 101 is used to obtain, after the speech data of a user undergoing identity verification is received, the voiceprint feature of the speech data, and to build the corresponding voiceprint feature vector based on the voiceprint feature.
In this embodiment, the speech data is collected by a voice capture device (for example, a microphone), and the voice capture device sends the collected speech data to the voiceprint-recognition-based identity verification system.
When collecting speech data, interference from ambient noise and from the voice capture device itself should be prevented as far as possible. The voice capture device should be kept at a suitable distance from the user, and a low-distortion capture device should be used where possible; the power supply should preferably be mains power, and the current should be kept stable; a dedicated sensor should be used for telephone recording. Before the voiceprint feature is extracted from the speech data, denoising may be applied to the speech data to further reduce interference. So that the voiceprint feature can be extracted, the collected speech data is speech data of a preset data length, or speech data longer than the preset data length.
There are many types of voiceprint features, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints. The voiceprint feature of this embodiment is preferably the Mel-frequency cepstral coefficients (Mel Frequency Cepstrum Coefficient, MFCC) of the speech data. When the corresponding voiceprint feature vector is built, the voiceprint features of the speech data form a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
The build module 102 is used to input the voiceprint feature vector into a background channel model generated by training in advance, so as to construct the current voiceprint identification vector corresponding to the speech data.
The voiceprint feature vector is input into the background channel model generated by training in advance; preferably, the background channel model is a Gaussian mixture model, and the voiceprint feature vector is evaluated with the background channel model to obtain the corresponding current voiceprint identification vector (i.e. the i-vector).
Specifically, the calculation process includes:
1) Gaussian selection: first, the likelihood log-value of each frame of data under each Gaussian component is calculated using the parameters of the universal background channel model; the columns of the likelihood log-value matrix are sorted in parallel and the top N Gaussian components are chosen, finally yielding, for each frame of data, a matrix of values under the Gaussian mixture model:

Loglike = E(X) · D(X)⁻¹ · Xᵀ − 0.5 · D(X)⁻¹ · (X^.2)ᵀ,

where Loglike is the likelihood log-value matrix, E(X) is the mean matrix trained from the universal background channel model, D(X) is the covariance matrix, X is the data matrix, and X^.2 is the matrix with each value squared element-wise.
2) Posterior probability calculation: for each frame of data X, X·Xᵀ is computed to obtain a symmetric matrix, which can be reduced to a lower triangular matrix whose elements are arranged in order into one row, becoming a vector whose dimension is the number of frames N multiplied by the number of elements of the lower triangular matrix; the vectors of all frames are combined into a new data matrix. At the same time, the covariance matrices used for the probability calculation in the universal background model are each also reduced to a lower triangular matrix and arranged, like the data, into a new matrix of the same form. Using the mean matrix and covariance matrix in the universal background channel model, the likelihood log-value of each frame of data under its selected Gaussian components is calculated, softmax regression is then applied, and finally normalization is performed, giving each frame's posterior probability distribution over the Gaussian mixture model; the probability distribution vectors of all frames form the probability matrix.
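The posterior step above — per-frame Gaussian log-likelihoods normalized with a softmax into a probability matrix — can be sketched as follows. Diagonal covariances are assumed as a simplification of the source's lower-triangular matrix layout:

```python
import numpy as np

def frame_posteriors(X, means, variances, weights):
    """Per-frame posterior distribution over the K UBM Gaussians
    (diagonal covariances assumed; a simplification of the source's layout)."""
    # log N(x; mu_k, diag(var_k)) for every frame and component
    ll = -0.5 * (np.log(2 * np.pi * variances).sum(axis=1)
                 + (((X[:, None, :] - means) ** 2) / variances).sum(axis=2))
    ll += np.log(weights)
    ll -= ll.max(axis=1, keepdims=True)            # stabilize the softmax
    post = np.exp(ll)
    return post / post.sum(axis=1, keepdims=True)  # rows of the probability matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                        # 5 frames, 2-dim features
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.ones((2, 2))
weights = np.array([0.5, 0.5])
post = frame_posteriors(X, means, variances, weights)
print(post.shape)                                  # (5, 2); each row sums to 1
```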
3) Extraction of the current voiceprint identification vector: first-order and second-order statistics are calculated. The first-order statistic can be obtained by summing over the rows of the probability matrix:

Gamma_i = Σ_j loglike_ji,

where Gamma_i is the i-th element of the first-order statistic vector and loglike_ji is the element in row j, column i of the probability matrix.
The second-order statistic can be obtained by multiplying the transpose of the probability matrix by the feature data matrix:

X = Loglikeᵀ · feats,

where X is the second-order statistic matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order statistics are obtained, the linear and quadratic terms are computed in parallel, and the current voiceprint identification vector is calculated from the linear and quadratic terms.
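These two statistics (elsewhere called the zeroth- and first-order Baum-Welch statistics used in i-vector extraction) reduce to a column sum and a matrix product; a sketch with placeholder matrices:

```python
import numpy as np

def sufficient_statistics(post, feats):
    """post:  (n_frames, K) probability matrix from the posterior step.
    feats: (n_frames, D) feature data matrix (e.g. per-frame MFCCs).
    Returns the per-component occupancies (column sums of the probability
    matrix, Gamma_i = sum_j loglike_ji) and post^T @ feats."""
    gamma = post.sum(axis=0)   # first-order statistic vector
    X = post.T @ feats         # second-order statistic matrix
    return gamma, X

post = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]])   # 3 frames, 2 Gaussians
feats = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 frames, 2-dim features
gamma, X = sufficient_statistics(post, feats)
print(gamma)   # [1.6 1.4]
print(X[0])    # [4.  5.6]
```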
Preferably, the background channel model is a Gaussian mixture model, and the voiceprint-recognition-based identity verification system further includes:
a second acquisition module, used to obtain a preset number of speech data samples, obtain the voiceprint feature corresponding to each speech data sample, and build each speech data sample's corresponding voiceprint feature vector based on its voiceprint feature;
a division module, used to divide the voiceprint feature vectors of the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, used to train the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is completed, verify the accuracy of the trained Gaussian mixture model with the validation set;
a processing module, used so that, if the accuracy is greater than a preset threshold, model training ends and the trained Gaussian mixture model is used as the background channel model, or, if the accuracy is less than or equal to the preset threshold, the number of speech data samples is increased and training is redone based on the increased speech data samples.
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood of an extracted D-dimensional voiceprint feature under K Gaussian components can be expressed as:

P(x) = Σ_{k=1}^{K} w_k · p(x | k),

where P(x) is the probability that a speech data sample is generated by the Gaussian mixture model, w_k is the weight of the k-th Gaussian component, p(x | k) is the probability that the sample is generated by the k-th Gaussian component, and K is the number of Gaussian components.
The parameters of the whole Gaussian mixture model can be expressed as {w_i, μ_i, Σ_i}, where w_i is the weight of the i-th Gaussian component, μ_i is its mean, and Σ_i is its covariance. The Gaussian mixture model is trained with the unsupervised EM algorithm. After training is completed, the weight vector, the constant vector, the N covariance matrices, the mean matrix multiplied by the covariances, etc. of the Gaussian mixture model are obtained; this is the trained Gaussian mixture model.
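A toy EM loop for a diagonal-covariance GMM, sketching the unsupervised training described above (not the patent's production trainer; the quantile initialization, component count, and iteration count are arbitrary assumptions):

```python
import numpy as np

def train_gmm_em(X, K=2, n_iter=50):
    """Fit a diagonal-covariance GMM to X of shape (n, D) with EM."""
    n, D = X.shape
    w = np.full(K, 1.0 / K)                                  # weights w_k
    mu = np.quantile(X, np.linspace(0.25, 0.75, K), axis=0)  # spread-out init
    var = np.tile(X.var(axis=0), (K, 1)) + 1e-6
    for _ in range(n_iter):
        # E-step: responsibilities from P(x) = sum_k w_k p(x|k)
        ll = -0.5 * (np.log(2 * np.pi * var).sum(axis=1)
                     + (((X[:, None, :] - mu) ** 2) / var).sum(axis=2))
        ll += np.log(w)
        ll -= ll.max(axis=1, keepdims=True)
        r = np.exp(ll)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate {w_k, mu_k, var_k}
        Nk = r.sum(axis=0) + 1e-10
        w = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        var = (r.T @ X**2) / Nk[:, None] - mu**2 + 1e-6
    return w, mu, var

# Two well-separated clusters; EM should recover means near 0 and 5.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(5, 1, (200, 2))])
w, mu, var = train_gmm_em(X)
print(np.round(np.sort(mu[:, 0]), 1))
```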
The first authentication module 103 is used to calculate the spatial distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user, to perform identity verification on the user based on the distance, and to generate a verification result.
There are many kinds of vector-to-vector distances, including the cosine distance and the Euclidean distance. Preferably, the spatial distance of this embodiment is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint identification vector is a voiceprint identification vector obtained and stored in advance; it is stored together with the identification information of its corresponding user and can accurately represent the identity of that user. Before the spatial distance is calculated, the stored voiceprint identification vector is retrieved according to the identification information provided by the user.
When the calculated spatial distance is less than or equal to a preset distance threshold, the verification passes; otherwise, it fails.
In a preferred embodiment, on the basis of the embodiment of Fig. 5 above, the first acquisition module 101 is specifically used to: perform pre-emphasis, framing, and windowing on the speech data; perform a Fourier transform on each window to obtain the corresponding spectrum; input the spectrum into a Mel filter bank to output the Mel spectrum; perform cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC); and form the corresponding voiceprint feature vector based on the MFCCs.
Pre-emphasis is in fact a high-pass filtering operation: it filters out low-frequency data so that the high-frequency characteristics of the speech data are more prominent. Specifically, the transfer function of the high-pass filter is H(z) = 1 − αz⁻¹, where z⁻¹ denotes a one-sample delay of the speech signal and α is a constant coefficient, preferably with a value of 0.97. Because a speech signal is stationary only over short periods, a segment of the signal is divided into N short-time segments (i.e., N frames), and, so that the continuity characteristics of the sound are not lost, adjacent frames share an overlap region, generally 1/2 of the frame length. After framing, each frame is processed as a stationary signal; however, because of the Gibbs effect, the start and end of each frame of speech data are discontinuous, so after framing the signal deviates further from the original speech. Therefore, windowing must be applied to the speech data.
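The pre-emphasis, framing, and windowing steps above can be sketched as follows; the frame length, 16 kHz sample rate, and the Hamming window choice are assumptions not fixed by the source:

```python
import numpy as np

def preprocess(signal, frame_len=400, alpha=0.97):
    """Pre-emphasis H(z) = 1 - alpha*z^-1, then framing with 50% overlap
    and a Hamming window (frame_len and window choice are assumptions)."""
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    hop = frame_len // 2                       # overlap of 1/2 frame length
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    return frames * np.hamming(frame_len)      # windowing against the Gibbs effect

signal = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s, 440 Hz tone
frames = preprocess(signal)
print(frames.shape)  # (79, 400)
```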
The cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is usually realized by a discrete cosine transform (DCT), and the 2nd through 13th DCT coefficients are taken as the MFCC coefficients. The MFCCs are the voiceprint feature of one frame of speech data; the MFCCs of all frames form a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
In a preferred embodiment, on the basis of the embodiment of Fig. 5 above, the first authentication module 103 is specifically used to calculate the cosine distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user: cos θ = (A·B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector; if the cosine distance is less than or equal to a preset distance threshold, to generate information that the verification has passed; and if the cosine distance is greater than the preset distance threshold, to generate information that the verification has failed.
In a preferred embodiment, on the basis of the embodiment of Fig. 4 above, the first authentication module above is replaced with a second authentication module, used to calculate the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, obtain the minimum spatial distance, perform identity verification on the user based on the minimum spatial distance, and generate a verification result.
This embodiment differs from the embodiment of Fig. 5 in that the standard voiceprint identification vectors are stored without the users' identification information. When a user's identity is verified, the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector is calculated and the minimum spatial distance is obtained; if the minimum spatial distance is less than a preset distance threshold (which may be the same as or different from the distance threshold of the embodiment above), the verification passes; otherwise, it fails.
The above are only preferred embodiments of the present invention and are not intended to limit the invention; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall be included in the scope of protection of the present invention.
Claims (10)
1. A voiceprint-recognition-based identity verification method, characterized in that the voiceprint-recognition-based identity verification method includes:
S1: after the speech data of a user undergoing identity verification is received, obtaining the voiceprint feature of the speech data, and building the corresponding voiceprint feature vector based on the voiceprint feature;
S2: inputting the voiceprint feature vector into a background channel model generated by training in advance, to construct the current voiceprint identification vector corresponding to the speech data;
S3: calculating the spatial distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user, performing identity verification on the user based on the distance, and generating a verification result.
2. The voiceprint-recognition-based identity verification method according to claim 1, characterized in that step S1 includes:
S11: performing pre-emphasis, framing, and windowing on the speech data;
S12: performing a Fourier transform on each window to obtain the corresponding spectrum;
S13: inputting the spectrum into a Mel filter bank to output the Mel spectrum;
S14: performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients MFCC, and forming the corresponding voiceprint feature vector based on the Mel-frequency cepstral coefficients MFCC.
3. The voiceprint-recognition-based identity verification method according to claim 1, characterized in that step S3 includes:
S31: calculating the cosine distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user: cos θ = (A·B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector;
S32: if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification has passed;
S33: if the cosine distance is greater than the preset distance threshold, generating information that the verification has failed.
4. The voiceprint-recognition-based identity verification method according to any one of claims 1 to 3, characterized in that the background channel model is a Gaussian mixture model, and before step S1 the method includes:
obtaining a preset number of speech data samples, obtaining the voiceprint feature corresponding to each speech data sample, and building each speech data sample's corresponding voiceprint feature vector based on its voiceprint feature;
dividing the voiceprint feature vectors of the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is completed, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model of step S2, or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and re-training based on the increased speech data samples.
5. The voiceprint-recognition-based identity verification method according to claim 1 or 2, characterized in that step S3 is replaced with: calculating the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, obtaining the minimum spatial distance, performing identity verification on the user based on the minimum spatial distance, and generating a verification result.
6. A voiceprint-recognition-based identity verification system, characterized in that the voiceprint-recognition-based identity verification system includes:
a first acquisition module, used to obtain, after the speech data of a user undergoing identity verification is received, the voiceprint feature of the speech data, and to build the corresponding voiceprint feature vector based on the voiceprint feature;
a build module, used to input the voiceprint feature vector into a background channel model generated by training in advance, to construct the current voiceprint identification vector corresponding to the speech data;
a first authentication module, used to calculate the spatial distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user, perform identity verification on the user based on the distance, and generate a verification result.
7. The voiceprint-recognition-based identity verification system according to claim 6, characterized in that the first acquisition module is specifically used to: perform pre-emphasis, framing, and windowing on the speech data; perform a Fourier transform on each window to obtain the corresponding spectrum; input the spectrum into a Mel filter bank to output the Mel spectrum; perform cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients MFCC; and form the corresponding voiceprint feature vector based on the Mel-frequency cepstral coefficients MFCC.
8. The voiceprint-recognition-based identity verification system according to claim 6, characterized in that the first authentication module is specifically used to calculate the cosine distance between the current voiceprint identification vector and the stored standard voiceprint identification vector of the user: cos θ = (A·B) / (‖A‖ ‖B‖), where A is the standard voiceprint identification vector and B is the current voiceprint identification vector; if the cosine distance is less than or equal to a preset distance threshold, to generate information that the verification has passed; and if the cosine distance is greater than the preset distance threshold, to generate information that the verification has failed.
9. The voiceprint-recognition-based identity verification system according to any one of claims 6 to 8, characterized in that the voiceprint-recognition-based identity verification system further includes:
a second acquisition module, used to obtain a preset number of speech data samples, obtain the voiceprint feature corresponding to each speech data sample, and build each speech data sample's corresponding voiceprint feature vector based on its voiceprint feature;
a division module, used to divide the voiceprint feature vectors of the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, used to train the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is completed, verify the accuracy of the trained Gaussian mixture model with the validation set;
a processing module, used so that, if the accuracy is greater than a preset threshold, model training ends and the trained Gaussian mixture model is used as the background channel model, or, if the accuracy is less than or equal to the preset threshold, the number of speech data samples is increased and training is redone based on the increased speech data samples.
10. The voiceprint-recognition-based identity verification system according to claim 6 or 7, characterized in that the first authentication module is replaced with a second authentication module, used to calculate the spatial distance between the current voiceprint identification vector and each stored standard voiceprint identification vector, obtain the minimum spatial distance, perform identity verification on the user based on the minimum spatial distance, and generate a verification result.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710147695.XA CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
PCT/CN2017/091361 WO2018166112A1 (en) | 2017-03-13 | 2017-06-30 | Voiceprint recognition-based identity verification method, electronic device, and storage medium |
CN201710715433.9A CN107517207A (en) | 2017-03-13 | 2017-08-20 | Server, auth method and computer-readable recording medium |
PCT/CN2017/105031 WO2018166187A1 (en) | 2017-03-13 | 2017-09-30 | Server, identity verification method and system, and a computer-readable storage medium |
TW106135250A TWI641965B (en) | 2017-03-13 | 2017-10-13 | Method and system of authentication based on voiceprint recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710147695.XA CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107068154A true CN107068154A (en) | 2017-08-18 |
Family
ID=59622093
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710147695.XA Pending CN107068154A (en) | 2017-03-13 | 2017-03-13 | The method and system of authentication based on Application on Voiceprint Recognition |
CN201710715433.9A Pending CN107517207A (en) | 2017-03-13 | 2017-08-20 | Server, auth method and computer-readable recording medium |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710715433.9A Pending CN107517207A (en) | 2017-03-13 | 2017-08-20 | Server, auth method and computer-readable recording medium |
Country Status (3)
Country | Link |
---|---|
CN (2) | CN107068154A (en) |
TW (1) | TWI641965B (en) |
WO (2) | WO2018166112A1 (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107993071A (en) * | 2017-11-21 | 2018-05-04 | 平安科技(深圳)有限公司 | Electronic device, auth method and storage medium based on vocal print |
CN108154371A (en) * | 2018-01-12 | 2018-06-12 | 平安科技(深圳)有限公司 | Electronic device, the method for authentication and storage medium |
CN108172230A (en) * | 2018-01-03 | 2018-06-15 | 平安科技(深圳)有限公司 | Voiceprint registration method, terminal installation and storage medium based on Application on Voiceprint Recognition model |
CN108269575A (en) * | 2018-01-12 | 2018-07-10 | 平安科技(深圳)有限公司 | Update audio recognition method, terminal installation and the storage medium of voice print database |
WO2018166187A1 (en) * | 2017-03-13 | 2018-09-20 | 平安科技(深圳)有限公司 | Server, identity verification method and system, and a computer-readable storage medium |
CN108766444A (en) * | 2018-04-09 | 2018-11-06 | 平安科技(深圳)有限公司 | User ID authentication method, server and storage medium |
CN108806695A (en) * | 2018-04-17 | 2018-11-13 | 平安科技(深圳)有限公司 | Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh |
CN109101801A (en) * | 2018-07-12 | 2018-12-28 | 北京百度网讯科技有限公司 | Method for identity verification, device, equipment and computer readable storage medium |
CN109256138A (en) * | 2018-08-13 | 2019-01-22 | 平安科技(深圳)有限公司 | Auth method, terminal device and computer readable storage medium |
CN109257362A (en) * | 2018-10-11 | 2019-01-22 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of voice print verification |
WO2019019256A1 (en) * | 2017-07-25 | 2019-01-31 | 平安科技(深圳)有限公司 | Electronic apparatus, identity verification method and system, and computer-readable storage medium |
CN109360573A (en) * | 2018-11-13 | 2019-02-19 | 平安科技(深圳)有限公司 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
CN109378002A (en) * | 2018-10-11 | 2019-02-22 | 平安科技(深圳)有限公司 | Method, apparatus, computer equipment and the storage medium of voice print verification |
CN109377662A (en) * | 2018-09-29 | 2019-02-22 | 途客易达(天津)网络科技有限公司 | Charging pile control method, device and electronic equipment |
CN109473105A (en) * | 2018-10-26 | 2019-03-15 | 平安科技(深圳)有限公司 | The voice print verification method, apparatus unrelated with text and computer equipment |
CN109493873A (en) * | 2018-11-13 | 2019-03-19 | 平安科技(深圳)有限公司 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
Families Citing this family (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108091326B (en) * | 2018-02-11 | 2021-08-06 | 张晓雷 | Voiceprint recognition method and system based on linear regression |
CN108768654B (en) * | 2018-04-09 | 2020-04-21 | 平安科技(深圳)有限公司 | Identity verification method based on voiceprint recognition, server and storage medium |
CN108694952B (en) * | 2018-04-09 | 2020-04-28 | 平安科技(深圳)有限公司 | Electronic device, identity authentication method and storage medium |
CN108447489B (en) * | 2018-04-17 | 2020-05-22 | 清华大学 | Continuous voiceprint authentication method and system with feedback |
CN108630208B (en) * | 2018-05-14 | 2020-10-27 | 平安科技(深圳)有限公司 | Server, voiceprint-based identity authentication method and storage medium |
CN108650266B (en) * | 2018-05-14 | 2020-02-18 | 平安科技(深圳)有限公司 | Server, voiceprint verification method and storage medium |
CN108834138B (en) * | 2018-05-25 | 2022-05-24 | 北京国联视讯信息技术股份有限公司 | Network distribution method and system based on voiceprint data |
CN109087647B (en) * | 2018-08-03 | 2023-06-13 | 平安科技(深圳)有限公司 | Voiceprint recognition processing method and device, electronic equipment and storage medium |
CN109450850B (en) * | 2018-09-26 | 2022-10-11 | 深圳壹账通智能科技有限公司 | Identity authentication method, identity authentication device, computer equipment and storage medium |
CN109147797B (en) * | 2018-10-18 | 2024-05-07 | 平安科技(深圳)有限公司 | Customer service method, device, computer equipment and storage medium based on voiceprint recognition |
CN110046910B (en) * | 2018-12-13 | 2023-04-14 | 蚂蚁金服(杭州)网络技术有限公司 | Method and equipment for judging validity of transaction performed by customer through electronic payment platform |
CN109816508A (en) * | 2018-12-14 | 2019-05-28 | 深圳壹账通智能科技有限公司 | User identity authentication method and device based on big data, and computer equipment |
CN109473108A (en) * | 2018-12-15 | 2019-03-15 | 深圳壹账通智能科技有限公司 | Identity verification method, device, equipment and storage medium based on voiceprint recognition |
CN109545226B (en) * | 2019-01-04 | 2022-11-22 | 平安科技(深圳)有限公司 | Voice recognition method, device and computer readable storage medium |
CN110322888B (en) * | 2019-05-21 | 2023-05-30 | 平安科技(深圳)有限公司 | Credit card unlocking method, apparatus, device and computer readable storage medium |
CN110334603A (en) * | 2019-06-06 | 2019-10-15 | 视联动力信息技术股份有限公司 | Authentication system |
CN111597531A (en) * | 2020-04-07 | 2020-08-28 | 北京捷通华声科技股份有限公司 | Identity authentication method and device, electronic equipment and readable storage medium |
CN111710340A (en) * | 2020-06-05 | 2020-09-25 | 深圳市卡牛科技有限公司 | Method, device, server and storage medium for identifying user identity based on voice |
CN111613230A (en) * | 2020-06-24 | 2020-09-01 | 泰康保险集团股份有限公司 | Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium |
CN112669841B (en) * | 2020-12-18 | 2024-07-02 | 平安科技(深圳)有限公司 | Training method and device for generating model of multilingual voice and computer equipment |
CN113889120A (en) * | 2021-09-28 | 2022-01-04 | 北京百度网讯科技有限公司 | Voiceprint feature extraction method and device, electronic equipment and storage medium |
Family Cites Families (35)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8645137B2 (en) * | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
CN1170239C (en) * | 2002-09-06 | 2004-10-06 | 浙江大学 | Palm acoustic-print verification system |
TWI234762B (en) * | 2003-12-22 | 2005-06-21 | Top Dihital Co Ltd | Voiceprint identification system for e-commerce |
US7447633B2 (en) * | 2004-11-22 | 2008-11-04 | International Business Machines Corporation | Method and apparatus for training a text independent speaker recognition system using speech data with text labels |
US7536304B2 (en) * | 2005-05-27 | 2009-05-19 | Porticus, Inc. | Method and system for bio-metric voice print authentication |
CN101064043A (en) * | 2006-04-29 | 2007-10-31 | 上海优浪信息科技有限公司 | Voiceprint access control system and uses thereof |
CN102479511A (en) * | 2010-11-23 | 2012-05-30 | 盛乐信息技术(上海)有限公司 | Large-scale voiceprint authentication method and system |
TW201301261A (en) * | 2011-06-27 | 2013-01-01 | Hon Hai Prec Ind Co Ltd | Identity authentication system and method thereof |
CN102238190B (en) * | 2011-08-01 | 2013-12-11 | 安徽科大讯飞信息科技股份有限公司 | Identity authentication method and system |
CN102509547B (en) * | 2011-12-29 | 2013-06-19 | 辽宁工业大学 | Method and system for voiceprint recognition based on vector quantization |
US9042867B2 (en) * | 2012-02-24 | 2015-05-26 | Agnitio S.L. | System and method for speaker recognition on mobile devices |
CN102695112A (en) * | 2012-06-09 | 2012-09-26 | 九江妙士酷实业有限公司 | Automobile player and volume control method thereof |
CN102820033B (en) * | 2012-08-17 | 2013-12-04 | 南京大学 | Voiceprint identification method |
CN102916815A (en) * | 2012-11-07 | 2013-02-06 | 华为终端有限公司 | Method and device for checking identity of user |
CN103220286B (en) * | 2013-04-10 | 2015-02-25 | 郑方 | Identity verification system and identity verification method based on dynamic password voice |
CN104427076A (en) * | 2013-08-30 | 2015-03-18 | 中兴通讯股份有限公司 | Recognition method and recognition device for automatic answering of calling system |
CN103632504A (en) * | 2013-12-17 | 2014-03-12 | 上海电机学院 | Silence reminder for library |
CN104765996B (en) * | 2014-01-06 | 2018-04-27 | 讯飞智元信息科技有限公司 | Voiceprint password authentication method and system |
CN104978507B (en) * | 2014-04-14 | 2019-02-01 | 中国石油化工集团公司 | Identity authentication method based on voiceprint recognition for an intelligent well-logging evaluation expert system |
CN105100911A (en) * | 2014-05-06 | 2015-11-25 | 夏普株式会社 | Intelligent multimedia system and method |
CN103986725A (en) * | 2014-05-29 | 2014-08-13 | 中国农业银行股份有限公司 | Client side, server side and identity authentication system and method |
CN104157301A (en) * | 2014-07-25 | 2014-11-19 | 广州三星通信技术研究有限公司 | Method, device and terminal for deleting blank segments from voice information |
CN105321293A (en) * | 2014-09-18 | 2016-02-10 | 广东小天才科技有限公司 | Danger detection reminding method and intelligent equipment |
CN104485102A (en) * | 2014-12-23 | 2015-04-01 | 智慧眼(湖南)科技发展有限公司 | Voiceprint recognition method and device |
CN104751845A (en) * | 2015-03-31 | 2015-07-01 | 江苏久祥汽车电器集团有限公司 | Voice recognition method and system used for intelligent robot |
CN104992708B (en) * | 2015-05-11 | 2018-07-24 | 国家计算机网络与信息安全管理中心 | Short-term specific audio detection model generation and detection method |
CN105096955B (en) * | 2015-09-06 | 2019-02-01 | 广东外语外贸大学 | Rapid speaker identification method and system based on model-growth clustering |
CN105611461B (en) * | 2016-01-04 | 2019-12-17 | 浙江宇视科技有限公司 | Noise suppression method, device and system for front-end equipment voice application system |
CN105575394A (en) * | 2016-01-04 | 2016-05-11 | 北京时代瑞朗科技有限公司 | Voiceprint recognition method based on hybrid modeling of total variability space and deep learning |
CN106971717A (en) * | 2016-01-14 | 2017-07-21 | 芋头科技(杭州)有限公司 | Robot, and speech recognition method and device for collaborative processing with a web server |
CN105869645B (en) * | 2016-03-25 | 2019-04-12 | 腾讯科技(深圳)有限公司 | Voice data processing method and device |
CN106210323B (en) * | 2016-07-13 | 2019-09-24 | Oppo广东移动通信有限公司 | Speech playback method and terminal device |
CN106169295B (en) * | 2016-07-15 | 2019-03-01 | 腾讯科技(深圳)有限公司 | Identity vector generation method and device |
CN106373576B (en) * | 2016-09-07 | 2020-07-21 | Tcl科技集团股份有限公司 | Speaker verification method and system based on VQ and SVM algorithms |
CN107068154A (en) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | Method and system for identity verification based on voiceprint recognition |
2017
- 2017-03-13 CN CN201710147695.XA patent/CN107068154A/en active Pending
- 2017-06-30 WO PCT/CN2017/091361 patent/WO2018166112A1/en active Application Filing
- 2017-08-20 CN CN201710715433.9A patent/CN107517207A/en active Pending
- 2017-09-30 WO PCT/CN2017/105031 patent/WO2018166187A1/en active Application Filing
- 2017-10-13 TW TW106135250A patent/TWI641965B/en active
Cited By (48)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018166187A1 (en) * | 2017-03-13 | 2018-09-20 | 平安科技(深圳)有限公司 | Server, identity verification method and system, and a computer-readable storage medium |
WO2019019256A1 (en) * | 2017-07-25 | 2019-01-31 | 平安科技(深圳)有限公司 | Electronic apparatus, identity verification method and system, and computer-readable storage medium |
US11068571B2 (en) | 2017-07-25 | 2021-07-20 | Ping An Technology (Shenzhen) Co., Ltd. | Electronic device, method and system of identity verification and computer readable storage medium |
WO2019100606A1 (en) * | 2017-11-21 | 2019-05-31 | 平安科技(深圳)有限公司 | Electronic device, voiceprint-based identity verification method and system, and storage medium |
CN107993071A (en) * | 2017-11-21 | 2018-05-04 | 平安科技(深圳)有限公司 | Electronic device, voiceprint-based identity verification method and storage medium |
CN108172230A (en) * | 2018-01-03 | 2018-06-15 | 平安科技(深圳)有限公司 | Voiceprint registration method, terminal device and storage medium based on voiceprint recognition model |
CN108269575B (en) * | 2018-01-12 | 2021-11-02 | 平安科技(深圳)有限公司 | Voice recognition method for updating voiceprint data, terminal device and storage medium |
WO2019136912A1 (en) * | 2018-01-12 | 2019-07-18 | 平安科技(深圳)有限公司 | Electronic device, identity authentication method and system, and storage medium |
CN108269575A (en) * | 2018-01-12 | 2018-07-10 | 平安科技(深圳)有限公司 | Speech recognition method for updating voiceprint database, terminal device and storage medium |
CN108154371A (en) * | 2018-01-12 | 2018-06-12 | 平安科技(深圳)有限公司 | Electronic device, identity verification method and storage medium |
CN108766444A (en) * | 2018-04-09 | 2018-11-06 | 平安科技(深圳)有限公司 | User identity authentication method, server and storage medium |
CN108766444B (en) * | 2018-04-09 | 2020-11-03 | 平安科技(深圳)有限公司 | User identity authentication method, server and storage medium |
CN108806695A (en) * | 2018-04-17 | 2018-11-13 | 平安科技(深圳)有限公司 | Self-updating anti-fraud method, apparatus, computer equipment and storage medium |
WO2019200744A1 (en) * | 2018-04-17 | 2019-10-24 | 平安科技(深圳)有限公司 | Self-updated anti-fraud method and apparatus, computer device and storage medium |
CN109101801A (en) * | 2018-07-12 | 2018-12-28 | 北京百度网讯科技有限公司 | Identity verification method, device, equipment and computer-readable storage medium |
US11294995B2 (en) | 2018-07-12 | 2022-04-05 | Beijing Baidu Netcom Science Technology Co., Ltd. | Method and apparatus for identity authentication, and computer readable storage medium |
CN109101801B (en) * | 2018-07-12 | 2021-04-27 | 北京百度网讯科技有限公司 | Method, apparatus, device and computer readable storage medium for identity authentication |
CN109256138A (en) * | 2018-08-13 | 2019-01-22 | 平安科技(深圳)有限公司 | Identity verification method, terminal device and computer-readable storage medium |
CN109256138B (en) * | 2018-08-13 | 2023-07-07 | 平安科技(深圳)有限公司 | Identity verification method, terminal device and computer readable storage medium |
CN110867189A (en) * | 2018-08-28 | 2020-03-06 | 北京京东尚科信息技术有限公司 | Login method and device |
CN110880325B (en) * | 2018-09-05 | 2022-06-28 | 华为技术有限公司 | Identity recognition method and equipment |
CN110880325A (en) * | 2018-09-05 | 2020-03-13 | 华为技术有限公司 | Identity recognition method and equipment |
CN109377662A (en) * | 2018-09-29 | 2019-02-22 | 途客易达(天津)网络科技有限公司 | Charging pile control method, device and electronic equipment |
CN109378002B (en) * | 2018-10-11 | 2024-05-07 | 平安科技(深圳)有限公司 | Voiceprint verification method, voiceprint verification device, computer equipment and storage medium |
CN109378002A (en) * | 2018-10-11 | 2019-02-22 | 平安科技(深圳)有限公司 | Voiceprint verification method, apparatus, computer equipment and storage medium |
CN109257362A (en) * | 2018-10-11 | 2019-01-22 | 平安科技(深圳)有限公司 | Voiceprint verification method, apparatus, computer equipment and storage medium |
CN109524026A (en) * | 2018-10-26 | 2019-03-26 | 北京网众共创科技有限公司 | Method and device for determining prompt tone, storage medium and electronic device |
CN109473105A (en) * | 2018-10-26 | 2019-03-15 | 平安科技(深圳)有限公司 | Text-independent voiceprint verification method, apparatus and computer equipment |
CN109524026B (en) * | 2018-10-26 | 2022-04-26 | 北京网众共创科技有限公司 | Method and device for determining prompt tone, storage medium and electronic device |
CN109493873A (en) * | 2018-11-13 | 2019-03-19 | 平安科技(深圳)有限公司 | Livestock voiceprint recognition method, device, terminal device and computer storage medium |
CN109360573A (en) * | 2018-11-13 | 2019-02-19 | 平安科技(深圳)有限公司 | Livestock voiceprint recognition method, device, terminal device and computer storage medium |
CN109636630A (en) * | 2018-12-07 | 2019-04-16 | 泰康保险集团股份有限公司 | Method, apparatus, medium and electronic equipment for detecting insurance application behavior |
CN110298150A (en) * | 2019-05-29 | 2019-10-01 | 上海拍拍贷金融信息服务有限公司 | Identity verification method and system based on speech recognition |
CN110298150B (en) * | 2019-05-29 | 2021-11-26 | 上海拍拍贷金融信息服务有限公司 | Identity verification method and system based on voice recognition |
CN110738998A (en) * | 2019-09-11 | 2020-01-31 | 深圳壹账通智能科技有限公司 | Voice-based personal credit evaluation method, device, terminal and storage medium |
CN110473569A (en) * | 2019-09-11 | 2019-11-19 | 苏州思必驰信息科技有限公司 | Optimization method and system for detecting speaker spoofing attacks |
CN110971755A (en) * | 2019-11-18 | 2020-04-07 | 武汉大学 | Two-factor identity authentication method based on PIN code and pressure code |
CN111402899A (en) * | 2020-03-25 | 2020-07-10 | 中国工商银行股份有限公司 | Cross-channel voiceprint identification method and device |
CN111402899B (en) * | 2020-03-25 | 2023-10-13 | 中国工商银行股份有限公司 | Cross-channel voiceprint recognition method and device |
CN111625704A (en) * | 2020-05-11 | 2020-09-04 | 镇江纵陌阡横信息科技有限公司 | Non-personalized recommendation algorithm model based on user intention and data cooperation |
CN111899566A (en) * | 2020-08-11 | 2020-11-06 | 南京畅淼科技有限责任公司 | Ship traffic management system based on AIS |
CN112289324A (en) * | 2020-10-27 | 2021-01-29 | 湖南华威金安企业管理有限公司 | Voiceprint identity recognition method and device and electronic equipment |
CN112289324B (en) * | 2020-10-27 | 2024-05-10 | 湖南华威金安企业管理有限公司 | Voiceprint identity recognition method and device and electronic equipment |
CN112802481A (en) * | 2021-04-06 | 2021-05-14 | 北京远鉴信息技术有限公司 | Voiceprint verification method, voiceprint recognition model training method, device and equipment |
CN113421575A (en) * | 2021-06-30 | 2021-09-21 | 平安科技(深圳)有限公司 | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium |
CN113421575B (en) * | 2021-06-30 | 2024-02-06 | 平安科技(深圳)有限公司 | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium |
CN114780787A (en) * | 2022-04-01 | 2022-07-22 | 杭州半云科技有限公司 | Voiceprint retrieval method, identity verification method, identity registration method and device |
CN114826709A (en) * | 2022-04-15 | 2022-07-29 | 马上消费金融股份有限公司 | Identity authentication and acoustic environment detection method, system, electronic device and medium |
Also Published As
Publication number | Publication date |
---|---|
WO2018166112A1 (en) | 2018-09-20 |
TW201833810A (en) | 2018-09-16 |
WO2018166187A1 (en) | 2018-09-20 |
TWI641965B (en) | 2018-11-21 |
CN107517207A (en) | 2017-12-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107068154A (en) | Method and system for identity verification based on voiceprint recognition | |
CN107527620B (en) | Electronic device, identity verification method and computer-readable storage medium | |
CN110457432B (en) | Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium | |
CN112562691B (en) | Voiceprint recognition method, voiceprint recognition device, computer equipment and storage medium | |
CN102737633B (en) | Method and device for recognizing speaker based on tensor subspace analysis | |
CN109524014A (en) | Voiceprint recognition analysis method based on deep convolutional neural networks | |
CN107993071A (en) | Electronic device, voiceprint-based identity verification method and storage medium | |
CN107610707A (en) | Voiceprint recognition method and device | |
CN111694940B (en) | User report generation method and terminal equipment | |
CN106847292A (en) | Voiceprint recognition method and device | |
CN111933154B (en) | Method, equipment and computer readable storage medium for recognizing fake voice | |
CN108154371A (en) | Electronic device, identity verification method and storage medium | |
CN105096955A (en) | Rapid speaker identification method and system based on model growing-and-clustering algorithm | |
CN110473552A (en) | Speech recognition authentication method and system | |
CN116153337B (en) | Synthetic speech traceability forensics method and device, electronic equipment and storage medium | |
CN115083422B (en) | Speech traceability forensics method and device, equipment and storage medium | |
CN109545226A (en) | Speech recognition method, equipment and computer-readable storage medium | |
Zhang et al. | Temporal Transformer Networks for Acoustic Scene Classification. | |
CN113793620B (en) | Voice noise reduction method, device and equipment based on scene classification and storage medium | |
CN108650266A (en) | Server, voiceprint verification method and storage medium | |
CN108630208A (en) | Server, voiceprint-based identity verification method and storage medium | |
CN117253490A (en) | Conformer-based speaker verification method and system | |
CN110047491A (en) | Random-digit-password-related speaker recognition method and device | |
CN113870887A (en) | Single-channel speech enhancement method and device, computer equipment and storage medium | |
CN117935813B (en) | Voiceprint recognition method and voiceprint recognition system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20170818 |