CN107068154A - Method and system for identity authentication based on voiceprint recognition - Google Patents

Info

Publication number
CN107068154A
Authority
CN
China
Prior art keywords
vocal print
speech data
authentication
print feature
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710147695.XA
Other languages
Chinese (zh)
Inventor
王健宗
丁涵宇
郭卉
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201710147695.XA priority Critical patent/CN107068154A/en
Priority to PCT/CN2017/091361 priority patent/WO2018166112A1/en
Publication of CN107068154A publication Critical patent/CN107068154A/en
Priority to CN201710715433.9A priority patent/CN107517207A/en
Priority to PCT/CN2017/105031 priority patent/WO2018166187A1/en
Priority to TW106135250A priority patent/TWI641965B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Biomedical Technology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Business, Economics & Management (AREA)
  • General Engineering & Computer Science (AREA)
  • Game Theory and Decision Science (AREA)
  • Collating Specific Patterns (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to a method and system for identity authentication based on voiceprint recognition. The method includes: after receiving the speech data of a user undergoing identity authentication, obtaining the voiceprint features of the speech data and building a corresponding voiceprint feature vector from those features; inputting the voiceprint feature vector into a background channel model generated by prior training, to construct the current voiceprint discriminant vector corresponding to the speech data; and calculating the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, authenticating the user based on that distance, and generating a verification result. The invention can improve the accuracy and efficiency of user identity authentication.

Description

Method and system for identity authentication based on voiceprint recognition
Technical field
The present invention relates to the field of communication technology, and more particularly to a method and system for identity authentication based on voiceprint recognition.
Background technology
At present, the business scope of large financial companies covers multiple lines such as insurance, banking and investment. Each line usually requires communication with the same customer, and that communication takes various forms (for example, by telephone or face to face). Verifying the customer's identity before communicating is an important part of ensuring business security. To meet the real-time demands of the business, financial companies usually analyse and verify customer identity manually. Because the customer base is huge, verifying identity through manual discriminant analysis is neither accurate nor efficient.
Summary of the invention
The object of the present invention is to provide a method and system for identity authentication based on voiceprint recognition, intended to improve the accuracy and efficiency of user identity authentication.
To achieve the above object, the present invention provides a method for identity authentication based on voiceprint recognition, the method including:
S1, after receiving the speech data of a user undergoing identity authentication, obtaining the voiceprint features of the speech data, and building a corresponding voiceprint feature vector based on the voiceprint features;
S2, inputting the voiceprint feature vector into a background channel model generated by prior training, to construct the current voiceprint discriminant vector corresponding to the speech data;
S3, calculating the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, authenticating the user based on the distance, and generating a verification result.
Preferably, step S1 includes:
S11, performing pre-emphasis, framing and windowing on the speech data;
S12, performing a Fourier transform on each windowed frame to obtain the corresponding spectrum;
S13, passing the spectrum through a Mel filter to output a Mel spectrum;
S14, performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the corresponding voiceprint feature vector from the MFCC.
Preferably, step S3 includes:
S31, calculating the cosine distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user [formula image missing in source], in which one operand is the standard voiceprint discriminant vector and the other is the current voiceprint discriminant vector;
S32, if the cosine distance is less than or equal to a preset distance threshold, generating information indicating that verification has passed;
S33, if the cosine distance is greater than the preset distance threshold, generating information indicating that verification has failed.
Preferably, the background channel model is a Gaussian mixture model, and before step S1 the method includes:
obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those features;
dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model of step S2; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
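The train, validate and enlarge loop described above can be sketched as follows. This is only a minimal illustration under stated assumptions: `load_samples`, `train` and `evaluate` are hypothetical callables supplied by the caller, and the doubling factor and round limit are illustrative choices that do not come from the patent.

```python
import random

def split_features(features, train_ratio, valid_ratio, seed=0):
    """Split feature vectors into a training set and a validation set.

    The two ratios must sum to at most 1, as the text requires.
    """
    if train_ratio + valid_ratio > 1:
        raise ValueError("ratios must sum to at most 1")
    shuffled = list(features)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_ratio)
    n_valid = int(len(shuffled) * valid_ratio)
    return shuffled[:n_train], shuffled[n_train:n_train + n_valid]

def train_until_accurate(load_samples, train, evaluate,
                         n_samples, accuracy_threshold,
                         train_ratio=0.7, valid_ratio=0.3,
                         growth=2, max_rounds=5):
    """Retrain on more samples until validation accuracy exceeds the threshold.

    load_samples(n), train(train_set) and evaluate(model, valid_set) are
    hypothetical callables; growth and max_rounds are illustrative assumptions.
    """
    for _ in range(max_rounds):
        features = load_samples(n_samples)
        train_set, valid_set = split_features(features, train_ratio, valid_ratio)
        model = train(train_set)
        if evaluate(model, valid_set) > accuracy_threshold:
            return model, n_samples  # accurate enough: training ends
        n_samples *= growth  # accuracy too low: increase the sample count
    raise RuntimeError("accuracy threshold not reached")
```

The round limit is there only so that the sketch terminates; the text itself does not bound the number of retraining rounds.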
Preferably, step S3 is replaced with: calculating the spatial distance between the current voiceprint discriminant vector and each prestored standard voiceprint discriminant vector, obtaining the minimum spatial distance, authenticating the user based on the minimum spatial distance, and generating a verification result.
To achieve the above object, the present invention also provides a system for identity authentication based on voiceprint recognition, the system including:
a first acquisition module, for obtaining, after the speech data of a user undergoing identity authentication is received, the voiceprint features of the speech data, and building a corresponding voiceprint feature vector based on the voiceprint features;
a construction module, for inputting the voiceprint feature vector into a background channel model generated by prior training, to construct the current voiceprint discriminant vector corresponding to the speech data;
a first verification module, for calculating the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, authenticating the user based on the distance, and generating a verification result.
Preferably, the first acquisition module is specifically configured to: perform pre-emphasis, framing and windowing on the speech data; perform a Fourier transform on each windowed frame to obtain the corresponding spectrum; pass the spectrum through a Mel filter to output a Mel spectrum; and perform cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), composing the corresponding voiceprint feature vector from the MFCC.
Preferably, the first verification module is specifically configured to: calculate the cosine distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user [formula image missing in source], in which one operand is the standard voiceprint discriminant vector and the other is the current voiceprint discriminant vector; if the cosine distance is less than or equal to a preset distance threshold, generate information indicating that verification has passed; and if the cosine distance is greater than the preset distance threshold, generate information indicating that verification has failed.
Preferably, the system for identity authentication based on voiceprint recognition further includes:
a second acquisition module, for obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those features;
a division module, for dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, for training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
a processing module, for ending model training if the accuracy is greater than a preset threshold and using the trained Gaussian mixture model as the background channel model, or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
Preferably, the first verification module is replaced with a second verification module, for calculating the spatial distance between the current voiceprint discriminant vector and each prestored standard voiceprint discriminant vector, obtaining the minimum spatial distance, authenticating the user based on the minimum spatial distance, and generating a verification result.
The beneficial effects of the invention are as follows: the background channel model generated by prior training is obtained by mining and comparative training on a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, this model can accurately characterise the background voiceprint features present when the user speaks, remove them during recognition, and extract the intrinsic features of the user's voice. This can significantly improve the accuracy of user identity authentication and improve its efficiency.
Brief description of the drawings
Fig. 1 is a flow chart of a preferred embodiment of the method for identity authentication based on voiceprint recognition of the present invention;
Fig. 2 is a detailed flow chart of step S1 shown in Fig. 1;
Fig. 3 is a detailed flow chart of step S3 shown in Fig. 1;
Fig. 4 is a schematic diagram of the operating environment of a preferred embodiment of the system for identity authentication based on voiceprint recognition of the present invention;
Fig. 5 is a structural diagram of a preferred embodiment of the system for identity authentication based on voiceprint recognition of the present invention.
Detailed description of the embodiments
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples given serve only to explain the present invention and are not intended to limit its scope.
As shown in Fig. 1, which is a flow chart of an embodiment of the method for identity authentication based on voiceprint recognition of the present invention, the method can be performed by a system for identity authentication based on voiceprint recognition. The system can be implemented in software and/or hardware and can be integrated in a server. The method for identity authentication based on voiceprint recognition comprises the following steps:
Step S1, after receiving the speech data of a user undergoing identity authentication, obtaining the voiceprint features of the speech data, and building a corresponding voiceprint feature vector based on the voiceprint features;
In this embodiment, the speech data is collected by a voice capture device (for example, a microphone), which sends the collected speech data to the system for identity authentication based on voiceprint recognition.
When collecting speech data, interference from ambient noise and from the voice capture device should be avoided as far as possible. The voice capture device should be kept at a suitable distance from the user, and devices with large distortion should be avoided where possible; the power supply is preferably mains power, with the current kept stable; a sensor should be used for telephone recordings. Before the voiceprint features are extracted, the speech data may be denoised to further reduce interference. So that the voiceprint features can be extracted, the collected speech data is of a preset data length, or longer than the preset data length.
Voiceprint features include many types, such as wide-band voiceprint, narrow-band voiceprint and amplitude voiceprint. In this embodiment the voiceprint features are preferably the Mel-frequency cepstral coefficients (Mel Frequency Cepstrum Coefficient, MFCC) of the speech data. When the corresponding voiceprint feature vector is built, the voiceprint features of the speech data are composed into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
Step S2, inputting the voiceprint feature vector into a background channel model generated by prior training, to construct the current voiceprint discriminant vector corresponding to the speech data;
Here, the voiceprint feature vector is input into the background channel model generated by prior training. Preferably, the background channel model is a Gaussian mixture model; the background channel model is used to compute on the voiceprint feature vector and derive the corresponding current voiceprint discriminant vector (i.e. the i-vector).
Specifically, the calculation process includes:
1) Select Gaussian models: first, use the parameters of the universal background channel model to calculate the log-likelihood of each frame of data under the different Gaussian models; sort each column of the log-likelihood matrix in parallel and select the top N Gaussian models, finally obtaining a matrix of the values of each frame of data in the Gaussian mixture model:
$\mathrm{Loglike} = E(X) \cdot D(X)^{-1} \cdot X^{T} - 0.5 \cdot D(X)^{-1} \cdot (X^{.2})^{T}$,
where Loglike is the log-likelihood matrix, $E(X)$ is the mean matrix trained by the universal background channel model, $D(X)$ is the covariance matrix, $X$ is the data matrix, and $X^{.2}$ is the matrix with each element squared.
2) Calculate posterior probabilities: for each frame of data X, compute $X \cdot X^{T}$ to obtain a symmetric matrix, which can be reduced to a lower triangular matrix; arrange its elements in order into a single row, so that the computation operates on a matrix of N frames by the number of lower-triangular elements, and combine the vectors of all frames into a new data matrix. At the same time, reduce each covariance matrix used for probability calculation in the universal background model to a lower triangular matrix as well, giving a matrix of the same form as the new data matrix. Then use the mean matrix and covariance matrix of the universal background channel model to calculate the log-likelihood of each frame of data under the selected Gaussian models, apply Softmax regression, and finally normalize, obtaining the posterior probability distribution of each frame over the Gaussian mixture model. The probability distribution vectors of all frames form the probability matrix.
3) Extract the current voiceprint discriminant vector: first calculate the first-order and second-order coefficients. The first-order coefficients can be obtained by summing the probability matrix over its rows:
$\mathrm{Gamma}_i = \sum_{j} \mathrm{loglikes}_{ji}$,
where $\mathrm{Gamma}_i$ is the i-th element of the first-order coefficient vector and $\mathrm{loglikes}_{ji}$ is the i-th element of the j-th row of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
$X = \mathrm{Loglike}^{T} \cdot \mathrm{feats}$, where X is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients have been calculated, the first-order term and the quadratic term are computed in parallel, and the current voiceprint discriminant vector is then calculated from the first-order term and the quadratic term.
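Steps 1) to 3) above can be sketched for a diagonal-covariance Gaussian mixture model as follows. This is a simplified illustration rather than the patented computation: it assumes diagonal covariances, adds a per-component constant that the log-likelihood formula in the text leaves out, and stops at the first-order and second-order coefficients, because the text does not spell out how the final discriminant vector is derived from them. All function names are hypothetical.

```python
import math

def frame_loglikes(frame, means, variances, weights):
    """Log of w_k * N(frame | mean_k, diag(var_k)) for every component k."""
    out = []
    for mu, var, w in zip(means, variances, weights):
        # Linear and quadratic terms, matching
        # E(X) D(X)^-1 X^T - 0.5 D(X)^-1 (X.^2)^T for one frame.
        lin = sum(m * x / v for m, x, v in zip(mu, frame, var))
        quad = -0.5 * sum(x * x / v for x, v in zip(frame, var))
        # Per-component constant (assumption: not shown in the text's formula).
        const = math.log(w) - 0.5 * sum(
            m * m / v + math.log(2 * math.pi * v) for m, v in zip(mu, var))
        out.append(lin + quad + const)
    return out

def posteriors(loglikes, top_n):
    """Keep the top_n components, then softmax so the row sums to 1."""
    keep = sorted(range(len(loglikes)),
                  key=lambda k: loglikes[k], reverse=True)[:top_n]
    peak = max(loglikes[k] for k in keep)  # subtract the peak for stability
    exp = {k: math.exp(loglikes[k] - peak) for k in keep}
    total = sum(exp.values())
    return [exp[k] / total if k in exp else 0.0 for k in range(len(loglikes))]

def coefficients(frames, means, variances, weights, top_n):
    """Return (first-order vector Gamma, second-order matrix X)."""
    post = [posteriors(frame_loglikes(f, means, variances, weights), top_n)
            for f in frames]
    n_comp, dim = len(means), len(frames[0])
    # Gamma_i: sum of the posterior of component i over all frames j.
    gamma = [sum(row[i] for row in post) for i in range(n_comp)]
    # X = P^T * feats: posterior-weighted sum of the frames per component.
    second = [[sum(post[j][i] * frames[j][d] for j in range(len(frames)))
               for d in range(dim)] for i in range(n_comp)]
    return gamma, second
```

Because each posterior row sums to 1, the elements of `gamma` always sum to the number of frames, which is a convenient sanity check.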
Preferably, the background channel model is a Gaussian mixture model, and before the above step S1 the method includes:
obtaining a preset number of speech data samples, obtaining the voiceprint features corresponding to each speech data sample, and building the voiceprint feature vector corresponding to each speech data sample from those features;
dividing the voiceprint feature vectors corresponding to the speech data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, after training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model of step S2; or, if the accuracy is less than or equal to the preset threshold, increasing the number of speech data samples and retraining on the enlarged sample set.
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood of an extracted D-dimensional voiceprint feature can be expressed with K Gaussian components as:
$P(x) = \sum_{k=1}^{K} w_k \, p(x \mid k)$,
where $P(x)$ is the probability that a speech data sample is generated by the Gaussian mixture model, $w_k$ is the weight of each Gaussian model, $p(x \mid k)$ is the probability that the sample is generated by the k-th Gaussian model, and K is the number of Gaussian models.
The parameters of the whole Gaussian mixture model can be expressed as $\{w_i, \mu_i, \Sigma_i\}$, where $w_i$ is the weight of the i-th Gaussian model, $\mu_i$ is the mean of the i-th Gaussian model, and $\Sigma_i$ is the covariance of the i-th Gaussian model. The Gaussian mixture model can be trained with the unsupervised EM algorithm. After training is complete, the weight vector, constant vector, N covariance matrices, the matrix of means multiplied by covariances, and so on are obtained; together these constitute a trained Gaussian mixture model.
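As an illustration of the mixture likelihood and the EM training mentioned above, the following is a minimal sketch for a Gaussian mixture model with diagonal covariances. The diagonal restriction, the variance floor and all function names are assumptions made for brevity; the text does not specify these details.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum((xi - m) ** 2 / v + math.log(2 * math.pi * v)
                      for xi, m, v in zip(x, mean, var))

def mixture_prob(x, weights, means, variances):
    """P(x) = sum over k of w_k * p(x | k)."""
    return sum(w * math.exp(log_gauss(x, m, v))
               for w, m, v in zip(weights, means, variances))

def em_step(data, weights, means, variances, var_floor=1e-3):
    """One EM iteration; returns updated (weights, means, variances)."""
    n_comp, dim = len(weights), len(data[0])
    # E-step: responsibility of each component for each sample.
    resp = []
    for x in data:
        joint = [w * math.exp(log_gauss(x, m, v))
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([j / total for j in joint])
    # M-step: re-estimate the parameters from the responsibilities.
    new_w, new_m, new_v = [], [], []
    for k in range(n_comp):
        nk = sum(r[k] for r in resp)
        mean = [sum(r[k] * x[d] for r, x in zip(resp, data)) / nk
                for d in range(dim)]
        var = [max(var_floor,
                   sum(r[k] * (x[d] - mean[d]) ** 2
                       for r, x in zip(resp, data)) / nk)
               for d in range(dim)]
        new_w.append(nk / len(data))
        new_m.append(mean)
        new_v.append(var)
    return new_w, new_m, new_v
```

Calling `em_step` repeatedly from a rough initial guess moves the means towards the clusters in the data. A practical implementation would work in the log domain throughout to avoid underflow on high-dimensional features.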
Step S3, calculating the spatial distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user, authenticating the user based on the distance, and generating a verification result.
There are many kinds of distance between vectors, including cosine distance and Euclidean distance. Preferably, the spatial distance in this embodiment is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint discriminant vector is a voiceprint discriminant vector obtained and stored in advance. When stored, it carries the identification information of its corresponding user, and it can accurately represent that user's identity. Before the spatial distance is calculated, the stored voiceprint discriminant vector is retrieved according to the identification information provided by the user.
When the calculated spatial distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
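The threshold decision in step S3 can be sketched as below. Since the cosine-distance formula itself is not legible in this text, the sketch assumes the common definition of one minus the cosine similarity, which is consistent with smaller values meaning closer vectors; the function names and the threshold value in the usage are illustrative.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle between a and b); 0 means the same direction.

    Assumption: distance is defined as one minus cosine similarity.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("zero-length vector")
    return 1.0 - dot / norm

def verify(current, standard, threshold):
    """Pass when the distance is less than or equal to the threshold."""
    return cosine_distance(current, standard) <= threshold
```

For example, `verify([1.0, 0.1], [1.0, 0.0], 0.05)` passes because the two vectors point in nearly the same direction, while two orthogonal vectors have distance 1 and fail against the same threshold.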
Compared with the prior art, the background channel model generated by prior training in this embodiment is obtained by mining and comparative training on a large amount of speech data. While retaining the user's voiceprint features to the greatest extent, this model can accurately characterise the background voiceprint features present when the user speaks, remove them during recognition, and extract the intrinsic features of the user's voice, which can significantly improve the accuracy of user identity authentication and improve its efficiency. In addition, this embodiment makes full use of the voiceprint features in speech that are related to the vocal tract; such features do not require any restriction on the text, and therefore offer considerable flexibility during recognition and verification.
In a preferred embodiment, as shown in Fig. 2, on the basis of the embodiment of Fig. 1 above, step S1 includes:
Step S11, performing pre-emphasis, framing and windowing on the speech data. In this embodiment, the speech data is processed after the speech data of the user undergoing identity authentication is received. Pre-emphasis is in effect high-pass filtering: it filters out low-frequency data so that the high-frequency characteristics of the speech data are more prominent. Specifically, the transfer function of the high-pass filter is $H(Z) = 1 - \alpha Z^{-1}$, where Z is the speech data and $\alpha$ is a constant factor, preferably 0.97. Because a speech signal is stationary only over a short period, a segment of speech is divided into N short-time signals (i.e. N frames), and, to avoid losing the continuity of the sound, adjacent frames share an overlapping region, generally half the frame length. After framing, each frame is treated as a stationary signal. However, because of the Gibbs effect, the start and end of each frame are discontinuous, so after framing the signal deviates further from the original speech; the speech data therefore needs to be windowed.
Step S12, performing a Fourier transform on each windowed frame to obtain the corresponding spectrum;
Step S13, passing the spectrum through a Mel filter to output a Mel spectrum;
Step S14, performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the corresponding voiceprint feature vector from the MFCC. Cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform; the inverse transform is generally implemented with the discrete cosine transform (DCT), and the 2nd to 13th coefficients after the DCT are taken as the MFCC coefficients. The MFCC are the voiceprint features of the frame of speech data; the MFCC of all frames are composed into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the speech data.
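Steps S11 to S14 can be sketched end to end as follows. This is a dependency-free illustration, not a production feature extractor: the Hamming window, the frame length of 64 samples, the 20 triangular filters, the Hz-to-Mel formula and the naive DFT are all assumptions chosen to keep the example short; only the pre-emphasis factor 0.97, the half-frame overlap and the choice of the 2nd to 13th DCT coefficients come from the text.

```python
import cmath
import math

def pre_emphasis(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1], i.e. H(z) = 1 - alpha * z^-1."""
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frames_with_window(signal, frame_len):
    """Frames overlapping by half, each multiplied by a Hamming window."""
    hop = frame_len // 2
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[s * w for s, w in zip(signal[start:start + frame_len], window)]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def power_spectrum(frame):
    """Squared DFT magnitude for bins 0..N/2 (naive O(N^2) transform)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters with centres spaced evenly on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = (sample_rate / 2) / (n_bins - 1)
    bank = []
    for lo, mid, hi in zip(edges, edges[1:], edges[2:]):
        bank.append([max(0.0, min((b * bin_hz - lo) / (mid - lo),
                                  (hi - b * bin_hz) / (hi - mid)))
                     for b in range(n_bins)])
    return bank

def dct2(values):
    """Unnormalised DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values)) for k in range(n)]

def mfcc(signal, sample_rate, frame_len=64, n_filters=20):
    """Return one 12-coefficient MFCC row per frame (the feature data matrix)."""
    bank = mel_filterbank(n_filters, frame_len // 2 + 1, sample_rate)
    features = []
    for frame in frames_with_window(pre_emphasis(signal), frame_len):
        spec = power_spectrum(frame)
        mel = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_mel = [math.log(m + 1e-10) for m in mel]  # small offset avoids log(0)
        features.append(dct2(log_mel)[1:13])  # 2nd to 13th coefficients
    return features
```

In practice a fast Fourier transform from a numerical library would replace the naive DFT; the structure of the pipeline stays the same.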
In a preferred embodiment, as shown in Fig. 3, on the basis of the embodiment of Fig. 1 above, step S3 includes:
Step S31, calculating the cosine distance between the current voiceprint discriminant vector and the prestored standard voiceprint discriminant vector of the user [formula image missing in source], in which one operand is the standard voiceprint discriminant vector and the other is the current voiceprint discriminant vector;
Step S32, if the cosine distance is less than or equal to a preset distance threshold, generating information indicating that verification has passed;
Step S33, if the cosine distance is greater than the preset distance threshold, generating information indicating that verification has failed.
In a preferred embodiment, on the basis of the embodiment of Fig. 1 above, step S3 is replaced with: calculating the spatial distance between the current voiceprint discriminant vector and each prestored standard voiceprint discriminant vector, obtaining the minimum spatial distance, authenticating the user based on the minimum spatial distance, and generating a verification result.
This embodiment differs from the embodiment of Fig. 1 in that the standard voiceprint discriminant vectors are stored without the user's identification information. When the user's identity is verified, the spatial distance between the current voiceprint discriminant vector and each prestored standard voiceprint discriminant vector is calculated and the minimum spatial distance is obtained. If the minimum spatial distance is less than a preset distance threshold (which may be the same as or different from the distance threshold of the above embodiment), verification passes; otherwise verification fails.
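The minimum-distance variant can be sketched as follows. As before, cosine distance is assumed to mean one minus the cosine similarity, and the function names are hypothetical; the stored vectors are held in a plain list because this embodiment stores them without user identification.

```python
import math

def cosine_distance(a, b):
    """1 - cos(angle between a and b) (assumed definition)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def verify_against_all(current, stored_vectors, threshold):
    """Compare with every stored vector and decide on the minimum distance.

    Returns (passed, minimum_distance); passes only if the minimum
    distance is strictly below the threshold, as in this embodiment.
    """
    best = min(cosine_distance(current, v) for v in stored_vectors)
    return best < threshold, best
```

The cost of this variant grows linearly with the number of stored vectors, since every one of them is compared with the current vector.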
Referring to Fig. 4, Fig. 4 is a schematic diagram of the operating environment of a preferred embodiment of the system 10 for identity authentication based on voiceprint recognition of the present invention.
In this embodiment, the system 10 for identity authentication based on voiceprint recognition is installed and runs in an electronic device 1. The electronic device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer or a server. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12 and a display 13. Fig. 4 shows only the electronic device 1 with components 11-13; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.
In some embodiments, memory 11 may be an internal storage unit of electronic device 1, such as a hard disk or internal memory of electronic device 1. In other embodiments, memory 11 may be an external storage device of electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to electronic device 1. Further, memory 11 may include both an internal storage unit of electronic device 1 and an external storage device. Memory 11 is used to store the application software installed on electronic device 1 and various kinds of data, such as the program code of the voiceprint-recognition-based identity verification system 10. Memory 11 may also be used to temporarily store data that has been output or is about to be output.
In some embodiments, processor 12 may be a Central Processing Unit (CPU), a microprocessor or another data processing chip, used to run the program code stored in memory 11 or to process data, for example to execute the voiceprint-recognition-based identity verification system 10.
In some embodiments, display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. Display 13 is used to display the information processed in electronic device 1 and to display a visual user interface, such as a voiceprint recognition interface. Components 11-13 of electronic device 1 communicate with one another over a system bus.
Referring to Fig. 5, Fig. 5 is a functional module diagram of a preferred embodiment of the voiceprint-recognition-based identity verification system 10 of the present invention. In this embodiment, the voiceprint-recognition-based identity verification system 10 may be divided into one or more modules, which are stored in memory 11 and executed by one or more processors (processor 12 in this embodiment) to carry out the present invention. For example, in Fig. 5, the voiceprint-recognition-based identity verification system 10 may be divided into detection module 21, identification module 22, replication module 23, installation module 24 and startup module 25. A module, as referred to in the present invention, is a series of computer program instruction segments capable of performing a specific function, and is better suited than a program to describing the execution of the voiceprint-recognition-based identity verification system 10 in electronic device 1, wherein:
First acquisition module 101 is configured to, after receiving voice data of a user undergoing identity verification, obtain the voiceprint features of the voice data and construct a corresponding voiceprint feature vector based on the voiceprint features;
In this embodiment, the voice data is collected by a voice capture device (for example, a microphone), which sends the collected voice data to the voiceprint-recognition-based identity verification system.
When voice data is collected, interference from ambient noise and from the voice capture device itself should be avoided as far as possible. The voice capture device should be kept at a suitable distance from the user, and devices with high distortion should be avoided where possible; the power supply is preferably mains power, with the current kept stable; a sensor should be used when recording over the telephone. Before the voiceprint features are extracted, the voice data may be denoised to reduce interference further. So that the voiceprint features of the voice data can be extracted, the collected voice data has a preset data length or is longer than the preset data length.
Voiceprint features come in several types, such as wideband voiceprints, narrowband voiceprints and amplitude voiceprints. The voiceprint feature of this embodiment is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the voice data. When the corresponding voiceprint feature vector is constructed, the voiceprint features of the voice data are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the voice data.
Construction module 102 is configured to input the voiceprint feature vector into a background channel model generated by prior training, so as to construct the current voiceprint discrimination vector corresponding to the voice data;
Here, the voiceprint feature vector is input into the background channel model generated by prior training. Preferably, the background channel model is a Gaussian mixture model, and the voiceprint feature vector is processed with the background channel model to obtain the corresponding current voiceprint discrimination vector (i.e. the i-vector).
Specifically, the calculating process includes:
1) Selecting Gaussian models: first, the parameters of the universal background channel model are used to calculate the log-likelihood of each frame of data under each of the Gaussian models. Each column of the log-likelihood matrix is sorted in parallel and the top N Gaussian models are selected, finally yielding a matrix of the values of each frame of data in the Gaussian mixture model:
$\mathrm{Loglike} = E(X)\,D(X)^{-1}\,X^{T} - 0.5\,D(X)^{-1}\,(X^{.2})^{T}$,
where Loglike is the log-likelihood matrix, $E(X)$ is the mean matrix obtained by training the universal background channel model, $D(X)$ is the covariance matrix, $X$ is the data matrix, and $X^{.2}$ is the matrix with each element squared.
2) Calculating posterior probabilities: for each frame of data $X$, $X X^{T}$ is computed, giving a symmetric matrix that can be reduced to a lower triangular matrix. Its elements are arranged in order into a single row, so that the computation is carried out on a vector whose dimension is N frames times the number of lower-triangular elements, and the vectors of all frames are combined into a new data matrix. At the same time, each covariance matrix used to compute probabilities in the universal background model is also reduced to a lower triangular matrix and turned into a matrix of the same form as the new data matrix. The mean matrix and covariance matrix of the universal background channel model are then used to calculate the log-likelihood of each frame of data under the selected Gaussian models, Softmax regression is applied, and finally a normalization is performed to obtain the posterior probability distribution of each frame over the Gaussian mixture model. The probability distribution vectors of the frames together form the probability matrix.
3) Extracting the current voiceprint discrimination vector: first, the first-order and second-order coefficients are calculated. The first-order coefficients can be obtained by summing the columns of the probability matrix:
$\mathrm{Gamma}_i = \sum_{j} \mathrm{loglikes}_{ji}$,
where $\mathrm{Gamma}_i$ is the i-th element of the first-order coefficient vector and $\mathrm{loglikes}_{ji}$ is the i-th element of the j-th row of the probability matrix.
The second-order coefficients can be obtained by multiplying the transpose of the probability matrix by the data matrix:
$X = \mathrm{Loglike}^{T} \cdot \mathrm{feats}$, where $X$ is the second-order coefficient matrix, Loglike is the probability matrix, and feats is the feature data matrix.
After the first-order and second-order coefficients have been obtained, the first-order term and the quadratic term are computed in parallel, and the current voiceprint discrimination vector is then calculated from the first-order term and the quadratic term.
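Steps 1) to 3) above can be sketched in a few functions. This is a simplified illustration under stated assumptions, not the method as claimed: it assumes diagonal covariance matrices, adds the frame-independent constant of each Gaussian (which the Loglike formula above leaves out but which is needed to rank Gaussians correctly), applies the softmax only over the selected top N Gaussians, and stops at the two sets of coefficients. The final step from these coefficients to the discrimination vector is not shown because the source does not spell it out. All function names and the toy model are hypothetical.

```python
import math

def log_likelihoods(frame, means, variances, weights):
    """Log of w_k * N(frame; mean_k, diag(var_k)) for every Gaussian k."""
    scores = []
    for mu, var, w in zip(means, variances, weights):
        # Frame-independent constant of Gaussian k (an assumption of this sketch).
        const = math.log(w) - 0.5 * sum(
            math.log(2.0 * math.pi * v) + m * m / v for m, v in zip(mu, var))
        # Data-dependent part, matching the two terms of the Loglike formula.
        linear = sum(m / v * x for m, v, x in zip(mu, var, frame))
        quadratic = -0.5 * sum(x * x / v for v, x in zip(var, frame))
        scores.append(const + linear + quadratic)
    return scores

def posteriors(frame, means, variances, weights, top_n):
    """Softmax over the top_n best-scoring Gaussians; the rest get probability 0."""
    scores = log_likelihoods(frame, means, variances, weights)
    selected = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:top_n]
    peak = max(scores[k] for k in selected)  # subtract the max for numerical stability
    exp_scores = {k: math.exp(scores[k] - peak) for k in selected}
    total = sum(exp_scores.values())
    return [exp_scores.get(k, 0.0) / total for k in range(len(scores))]

def accumulate_statistics(feats, means, variances, weights, top_n):
    """Return (gamma, x): gamma[i] = sum_j post[j][i] and x = post^T * feats."""
    num_gauss, dim = len(means), len(feats[0])
    gamma = [0.0] * num_gauss
    x = [[0.0] * dim for _ in range(num_gauss)]
    for frame in feats:
        post = posteriors(frame, means, variances, weights, top_n)
        for i, p in enumerate(post):
            gamma[i] += p
            for d in range(dim):
                x[i][d] += p * frame[d]
    return gamma, x

# Toy model: 3 Gaussians over 2-dimensional features, 3 frames of data.
means = [[0.0, 0.0], [5.0, 5.0], [-5.0, 5.0]]
variances = [[1.0, 1.0]] * 3
weights = [1.0 / 3] * 3
feats = [[0.1, -0.2], [4.8, 5.1], [5.2, 4.9]]
gamma, x = accumulate_statistics(feats, means, variances, weights, top_n=2)
print([round(g, 3) for g in gamma])
```

Here `gamma` plays the role of the first-order coefficient vector and `x` the second-order coefficient matrix in the terminology of the text; in the i-vector literature these quantities are often called zeroth-order and first-order statistics.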
Preferably, the background channel model is a Gaussian mixture model, and the voiceprint-recognition-based identity verification system further includes:
a second acquisition module, configured to obtain a preset number of voice data samples, obtain the voiceprint features corresponding to each voice data sample, and construct the voiceprint feature vector corresponding to each voice data sample based on those voiceprint features;
a division module, configured to divide the voiceprint feature vectors corresponding to the voice data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, configured to train the Gaussian mixture model with the voiceprint feature vectors in the training set and, once training is complete, to verify the accuracy of the trained Gaussian mixture model with the validation set;
a processing module, configured to end model training and use the trained Gaussian mixture model as the background channel model if the accuracy is greater than a preset threshold, or, if the accuracy is less than or equal to the preset threshold, to increase the number of voice data samples and retrain based on the increased voice data samples.
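The division, training and processing modules together describe a retrain-until-accurate loop, which can be sketched as below. This is only an outline under assumptions: the source does not say how accuracy is measured or how many samples are added each round, so `get_samples`, `train` and `accuracy` are hypothetical callables supplied by the caller, and the ratios, step size and `max_count` cap are placeholders.

```python
def split_samples(vectors, first_ratio, second_ratio):
    """Split into a training set and a validation set by the two ratios."""
    if first_ratio < 0 or second_ratio < 0 or first_ratio + second_ratio > 1 + 1e-9:
        raise ValueError("ratios must be non-negative and sum to at most 1")
    n_train = int(len(vectors) * first_ratio)
    n_valid = int(len(vectors) * second_ratio)
    return vectors[:n_train], vectors[n_train:n_train + n_valid]

def train_until_accurate(get_samples, train, accuracy, threshold,
                         start_count, step, max_count,
                         first_ratio=0.75, second_ratio=0.25):
    """Retrain with more samples until validation accuracy exceeds threshold."""
    count = start_count
    while count <= max_count:
        train_set, valid_set = split_samples(get_samples(count), first_ratio, second_ratio)
        model = train(train_set)
        if accuracy(model, valid_set) > threshold:
            return model, count
        count += step  # accuracy too low: increase the number of samples
    raise RuntimeError("accuracy threshold not reached within max_count samples")

# Stand-ins for illustration only: the "model" is just the training-set size,
# and the "accuracy" grows with it.
model, used = train_until_accurate(
    get_samples=lambda n: list(range(n)),
    train=lambda train_set: len(train_set),
    accuracy=lambda model, valid_set: min(1.0, model / 100.0),
    threshold=0.9, start_count=40, step=40, max_count=400)
print(model, used)
```

The `max_count` cap is a design choice of this sketch so that the loop always terminates; the source describes no such limit.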
When the Gaussian mixture model is trained with the voiceprint feature vectors in the training set, the likelihood probability of an extracted D-dimensional voiceprint feature can be expressed with K Gaussian components as:
$P(x) = \sum_{k=1}^{K} w_k \, p(x \mid k)$,
where $P(x)$ is the probability that a voice data sample is generated by the Gaussian mixture model, $w_k$ is the weight of each Gaussian model, $p(x \mid k)$ is the probability that the sample is generated by the k-th Gaussian model, and $K$ is the number of Gaussian models.
The parameters of the whole Gaussian mixture model can be expressed as $\{w_i, \mu_i, \Sigma_i\}$, where $w_i$ is the weight of the i-th Gaussian model, $\mu_i$ is the mean of the i-th Gaussian model, and $\Sigma_i$ is the covariance of the i-th Gaussian model. The Gaussian mixture model can be trained with the unsupervised EM algorithm. After training is complete, the weight vector, the constant vector, the N covariance matrices, the matrix of means multiplied by covariances, and so on are obtained; together these constitute a trained Gaussian mixture model.
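The mixture likelihood and one round of EM re-estimation can be illustrated on a small example. This sketch makes simplifying assumptions that are not in the source: the data are one-dimensional (the text uses D-dimensional features), the means are initialised from the smallest and largest data points, and a variance floor guards against collapse. Function names and the toy data are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a one-dimensional Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def mixture_likelihood(x, weights, means, variances):
    """P(x) = sum_k w_k * p(x | k)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))

def em_step(data, weights, means, variances, var_floor=1e-3):
    """One EM iteration: E-step responsibilities, then M-step re-estimation."""
    k = len(weights)
    resp_sum = [0.0] * k
    x_sum = [0.0] * k
    xx_sum = [0.0] * k
    for x in data:
        joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        for i in range(k):
            r = joint[i] / total          # responsibility of component i for x
            resp_sum[i] += r
            x_sum[i] += r * x
            xx_sum[i] += r * x * x
    new_weights = [s / len(data) for s in resp_sum]
    new_means = [x_sum[i] / resp_sum[i] for i in range(k)]
    new_vars = [max(xx_sum[i] / resp_sum[i] - new_means[i] ** 2, var_floor)
                for i in range(k)]
    return new_weights, new_means, new_vars

# Two well-separated clusters, fitted with K = 2 components.
data = [-0.3, 0.1, 0.2, -0.1, 9.8, 10.1, 10.3, 9.9]
weights, means, variances = [0.5, 0.5], [min(data), max(data)], [1.0, 1.0]
for _ in range(20):
    weights, means, variances = em_step(data, weights, means, variances)
print([round(m, 3) for m in means], [round(w, 3) for w in weights])
```

No labels are used anywhere in the loop, which is what makes the procedure unsupervised.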
First verification module 103 is configured to calculate the spatial distance between the current voiceprint discrimination vector and the pre-stored standard voiceprint discrimination vector of the user, perform identity verification on the user based on the distance, and generate a verification result.
There are several kinds of distance between vectors, including cosine distance and Euclidean distance. Preferably, the spatial distance of this embodiment is the cosine distance, which uses the cosine of the angle between two vectors in a vector space as a measure of the difference between two individuals.
The standard voiceprint discrimination vector is a voiceprint discrimination vector obtained and stored in advance. It is stored together with the identification information of its corresponding user and accurately represents that user's identity. Before the spatial distance is calculated, the stored voiceprint discrimination vector is retrieved according to the identification information provided by the user.
When the calculated spatial distance is less than or equal to the preset distance threshold, verification passes; otherwise, verification fails.
In a preferred embodiment, based on the embodiment of Fig. 5 above, first acquisition module 101 is specifically configured to: perform pre-emphasis, framing and windowing on the voice data; apply a Fourier transform to each windowed frame to obtain the corresponding spectrum; input the spectrum into a Mel filter to output the Mel spectrum; and perform cepstral analysis on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), from which the corresponding voiceprint feature vector is formed.
Pre-emphasis is in effect high-pass filtering: it filters out low-frequency data so that the high-frequency characteristics of the voice data stand out more. Specifically, the transfer function of the high-pass filter is $H(Z) = 1 - \alpha Z^{-1}$, where $Z$ is the voice data and $\alpha$ is a constant coefficient, preferably 0.97. Because a voice signal is stationary only over short periods, a segment of sound is divided into N short-time segments (i.e. N frames), and, to avoid losing the continuity of the sound, adjacent frames share an overlapping region that is generally 1/2 of the frame length. After the voice data has been divided into frames, each frame is treated as a stationary signal. However, owing to the Gibbs effect, the starting frame and ending frame of the voice data are discontinuous, and after framing the signal deviates further from the original speech; the voice data therefore needs to be windowed.
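The pre-emphasis, framing and windowing steps can be sketched as follows. This is a minimal illustration under assumptions: the filter is applied in its time-domain form y[n] = x[n] - alpha * x[n-1], and a Hamming window is used because the source does not name a specific window. The function names, frame length and toy signal are hypothetical.

```python
import math

def pre_emphasize(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1], the time-domain form of H(Z) = 1 - alpha*Z^-1."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def split_frames(signal, frame_len, hop):
    """Cut the signal into frames of frame_len samples, advancing by hop samples."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]

def hamming(frame):
    """Apply a Hamming window (assumed window type) so the frame tapers at its edges."""
    n = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
            for i, s in enumerate(frame)]

# 16 samples of a toy signal; frame length 8 with a hop of 4 gives 1/2 overlap.
signal = [math.sin(0.3 * n) for n in range(16)]
frames = [hamming(f) for f in split_frames(pre_emphasize(signal), frame_len=8, hop=4)]
print(len(frames), len(frames[0]))
```

Real recordings would use frames of a few hundred samples; the tiny sizes here only keep the example readable.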
Cepstral analysis consists, for example, of taking the logarithm and applying an inverse transform. The inverse transform is generally implemented with the discrete cosine transform (DCT), and the 2nd to 13th coefficients after the DCT are taken as the MFCC coefficients. The Mel frequency cepstrum coefficients (MFCC) are the voiceprint features of the frame of voice data; the MFCCs of all frames are assembled into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the voice data.
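The cepstral-analysis step for one frame can be sketched as a logarithm followed by a DCT. This is an illustrative sketch under assumptions: it uses an unnormalised DCT-II, starts from Mel filter outputs that are already computed, floors the energies to avoid taking the logarithm of zero, and uses 26 filters, a size the source does not specify. The function names are hypothetical.

```python
import math

def dct_ii(values):
    """Unnormalised DCT-II: c[k] = sum_n v[n] * cos(pi * k * (n + 0.5) / N)."""
    n_vals = len(values)
    return [sum(v * math.cos(math.pi * k * (n + 0.5) / n_vals)
                for n, v in enumerate(values))
            for k in range(n_vals)]

def mfcc_from_mel_energies(mel_energies, first=2, last=13):
    """Log, DCT, then keep the first-th through last-th coefficients (1-based)."""
    log_energies = [math.log(max(e, 1e-10)) for e in mel_energies]  # floor avoids log(0)
    return dct_ii(log_energies)[first - 1:last]

# 26 Mel filter outputs for one frame (an assumed filterbank size).
mel_energies = [1.0 + 0.5 * math.sin(0.4 * m) for m in range(26)]
coeffs = mfcc_from_mel_energies(mel_energies)
print(len(coeffs))
```

Stacking the 12 coefficients of every frame row by row gives the feature data matrix described above.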
In a preferred embodiment, based on the embodiment of Fig. 5 above, first verification module 103 is specifically configured to calculate the cosine distance between the current voiceprint discrimination vector and the pre-stored standard voiceprint discrimination vector of the user, the two operands of the distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector; if the cosine distance is less than or equal to the preset distance threshold, to generate information indicating that verification has passed; and if the cosine distance is greater than the preset distance threshold, to generate information indicating that verification has failed.
In a preferred embodiment, based on the embodiment of Fig. 4 above, the first verification module is replaced with a second verification module, configured to calculate the spatial distance between the current voiceprint discrimination vector and each pre-stored standard voiceprint discrimination vector, obtain the minimum spatial distance, perform identity verification on the user based on the minimum spatial distance, and generate a verification result.
This embodiment differs from the embodiment of Fig. 5 in that the standard voiceprint discrimination vectors are stored without the users' identification information. When a user's identity is verified, the spatial distance between the current voiceprint discrimination vector and each pre-stored standard voiceprint discrimination vector is calculated and the minimum spatial distance is obtained. If the minimum spatial distance is less than a preset distance threshold (which may be the same as or different from the distance threshold of the embodiment above), verification passes; otherwise, verification fails.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method of identity verification based on voiceprint recognition, characterized in that the method comprises:
S1: after receiving voice data of a user undergoing identity verification, obtaining the voiceprint features of the voice data, and constructing a corresponding voiceprint feature vector based on the voiceprint features;
S2: inputting the voiceprint feature vector into a background channel model generated by prior training, so as to construct the current voiceprint discrimination vector corresponding to the voice data;
S3: calculating the spatial distance between the current voiceprint discrimination vector and a pre-stored standard voiceprint discrimination vector of the user, performing identity verification on the user based on the distance, and generating a verification result.
2. The method of identity verification based on voiceprint recognition according to claim 1, characterized in that step S1 comprises:
S11: performing pre-emphasis, framing and windowing on the voice data;
S12: applying a Fourier transform to each windowed frame to obtain the corresponding spectrum;
S13: inputting the spectrum into a Mel filter to output the Mel spectrum;
S14: performing cepstral analysis on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), and forming the corresponding voiceprint feature vector based on the MFCC.
3. The method of identity verification based on voiceprint recognition according to claim 1, characterized in that step S3 comprises:
S31: calculating the cosine distance between the current voiceprint discrimination vector and the pre-stored standard voiceprint discrimination vector of the user, the two operands of the distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;
S32: if the cosine distance is less than or equal to a preset distance threshold, generating information indicating that verification has passed;
S33: if the cosine distance is greater than the preset distance threshold, generating information indicating that verification has failed.
4. The method of identity verification based on voiceprint recognition according to any one of claims 1 to 3, characterized in that the background channel model is a Gaussian mixture model, and that before step S1 the method comprises:
obtaining a preset number of voice data samples, obtaining the voiceprint features corresponding to each voice data sample, and constructing the voiceprint feature vector corresponding to each voice data sample based on those voiceprint features;
dividing the voiceprint feature vectors corresponding to the voice data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
training the Gaussian mixture model with the voiceprint feature vectors in the training set and, once training is complete, verifying the accuracy of the trained Gaussian mixture model with the validation set;
if the accuracy is greater than a preset threshold, ending model training and using the trained Gaussian mixture model as the background channel model of step S2, or, if the accuracy is less than or equal to the preset threshold, increasing the number of voice data samples and retraining based on the increased voice data samples.
5. The method of identity verification based on voiceprint recognition according to claim 1 or 2, characterized in that step S3 is replaced with: calculating the spatial distance between the current voiceprint discrimination vector and each pre-stored standard voiceprint discrimination vector, obtaining the minimum spatial distance, performing identity verification on the user based on the minimum spatial distance, and generating a verification result.
6. A system of identity verification based on voiceprint recognition, characterized in that the system comprises:
a first acquisition module, configured to, after receiving voice data of a user undergoing identity verification, obtain the voiceprint features of the voice data and construct a corresponding voiceprint feature vector based on the voiceprint features;
a construction module, configured to input the voiceprint feature vector into a background channel model generated by prior training, so as to construct the current voiceprint discrimination vector corresponding to the voice data;
a first verification module, configured to calculate the spatial distance between the current voiceprint discrimination vector and a pre-stored standard voiceprint discrimination vector of the user, perform identity verification on the user based on the distance, and generate a verification result.
7. The system of identity verification based on voiceprint recognition according to claim 6, characterized in that the first acquisition module is specifically configured to: perform pre-emphasis, framing and windowing on the voice data; apply a Fourier transform to each windowed frame to obtain the corresponding spectrum; input the spectrum into a Mel filter to output the Mel spectrum; and perform cepstral analysis on the Mel spectrum to obtain the Mel frequency cepstrum coefficients (MFCC), forming the corresponding voiceprint feature vector based on the MFCC.
8. The system of identity verification based on voiceprint recognition according to claim 6, characterized in that the first verification module is specifically configured to calculate the cosine distance between the current voiceprint discrimination vector and the pre-stored standard voiceprint discrimination vector of the user, the two operands of the distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector; if the cosine distance is less than or equal to a preset distance threshold, to generate information indicating that verification has passed; and if the cosine distance is greater than the preset distance threshold, to generate information indicating that verification has failed.
9. The system of identity verification based on voiceprint recognition according to any one of claims 6 to 8, characterized in that the system further comprises:
a second acquisition module, configured to obtain a preset number of voice data samples, obtain the voiceprint features corresponding to each voice data sample, and construct the voiceprint feature vector corresponding to each voice data sample based on those voiceprint features;
a division module, configured to divide the voiceprint feature vectors corresponding to the voice data samples into a training set of a first ratio and a validation set of a second ratio, the sum of the first ratio and the second ratio being less than or equal to 1;
a training module, configured to train the Gaussian mixture model with the voiceprint feature vectors in the training set and, once training is complete, to verify the accuracy of the trained Gaussian mixture model with the validation set;
a processing module, configured to end model training and use the trained Gaussian mixture model as the background channel model if the accuracy is greater than a preset threshold, or, if the accuracy is less than or equal to the preset threshold, to increase the number of voice data samples and retrain based on the increased voice data samples.
10. The system of identity verification based on voiceprint recognition according to claim 6 or 7, characterized in that the first verification module is replaced with a second verification module, configured to calculate the spatial distance between the current voiceprint discrimination vector and each pre-stored standard voiceprint discrimination vector, obtain the minimum spatial distance, perform identity verification on the user based on the minimum spatial distance, and generate a verification result.
CN201710147695.XA 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition Pending CN107068154A (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN201710147695.XA CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition
PCT/CN2017/091361 WO2018166112A1 (en) 2017-03-13 2017-06-30 Voiceprint recognition-based identity verification method, electronic device, and storage medium
CN201710715433.9A CN107517207A (en) 2017-03-13 2017-08-20 Server, auth method and computer-readable recording medium
PCT/CN2017/105031 WO2018166187A1 (en) 2017-03-13 2017-09-30 Server, identity verification method and system, and a computer-readable storage medium
TW106135250A TWI641965B (en) 2017-03-13 2017-10-13 Method and system of authentication based on voiceprint recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710147695.XA CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition

Publications (1)

Publication Number Publication Date
CN107068154A true CN107068154A (en) 2017-08-18

Family

ID=59622093

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710147695.XA Pending CN107068154A (en) 2017-03-13 2017-03-13 The method and system of authentication based on Application on Voiceprint Recognition
CN201710715433.9A Pending CN107517207A (en) 2017-03-13 2017-08-20 Server, auth method and computer-readable recording medium

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201710715433.9A Pending CN107517207A (en) 2017-03-13 2017-08-20 Server, auth method and computer-readable recording medium

Country Status (3)

Country Link
CN (2) CN107068154A (en)
TW (1) TWI641965B (en)
WO (2) WO2018166112A1 (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107993071A (en) * 2017-11-21 2018-05-04 平安科技(深圳)有限公司 Electronic device, auth method and storage medium based on vocal print
CN108154371A (en) * 2018-01-12 2018-06-12 平安科技(深圳)有限公司 Electronic device, the method for authentication and storage medium
CN108172230A (en) * 2018-01-03 2018-06-15 平安科技(深圳)有限公司 Voiceprint registration method, terminal installation and storage medium based on Application on Voiceprint Recognition model
CN108269575A (en) * 2018-01-12 2018-07-10 平安科技(深圳)有限公司 Update audio recognition method, terminal installation and the storage medium of voice print database
WO2018166187A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Server, identity verification method and system, and a computer-readable storage medium
CN108766444A (en) * 2018-04-09 2018-11-06 平安科技(深圳)有限公司 User ID authentication method, server and storage medium
CN108806695A (en) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh
CN109101801A (en) * 2018-07-12 2018-12-28 北京百度网讯科技有限公司 Method for identity verification, device, equipment and computer readable storage medium
CN109256138A (en) * 2018-08-13 2019-01-22 平安科技(深圳)有限公司 Auth method, terminal device and computer readable storage medium
CN109257362A (en) * 2018-10-11 2019-01-22 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium of voice print verification
WO2019019256A1 (en) * 2017-07-25 2019-01-31 平安科技(深圳)有限公司 Electronic apparatus, identity verification method and system, and computer-readable storage medium
CN109360573A (en) * 2018-11-13 2019-02-19 平安科技(深圳)有限公司 Livestock method for recognizing sound-groove, device, terminal device and computer storage medium
CN109378002A (en) * 2018-10-11 2019-02-22 平安科技(深圳)有限公司 Method, apparatus, computer equipment and the storage medium of voice print verification
CN109377662A (en) * 2018-09-29 2019-02-22 途客易达(天津)网络科技有限公司 Charging pile control method, device and electronic equipment
CN109473105A (en) * 2018-10-26 2019-03-15 平安科技(深圳)有限公司 The voice print verification method, apparatus unrelated with text and computer equipment
CN109493873A (en) * 2018-11-13 2019-03-19 平安科技(深圳)有限公司 Livestock method for recognizing sound-groove, device, terminal device and computer storage medium
CN109524026A (en) * 2018-10-26 2019-03-26 北京网众共创科技有限公司 The determination method and device of prompt tone, storage medium, electronic device
CN109636630A (en) * 2018-12-07 2019-04-16 泰康保险集团股份有限公司 Method, apparatus, medium and electronic equipment of the detection for behavior of insuring
CN110298150A (en) * 2019-05-29 2019-10-01 上海拍拍贷金融信息服务有限公司 A kind of auth method and system based on speech recognition
CN110473569A (en) * 2019-09-11 2019-11-19 苏州思必驰信息科技有限公司 Detect the optimization method and system of speaker's spoofing attack
CN110738998A (en) * 2019-09-11 2020-01-31 深圳壹账通智能科技有限公司 Voice-based personal credit evaluation method, device, terminal and storage medium
CN110867189A (en) * 2018-08-28 2020-03-06 北京京东尚科信息技术有限公司 Login method and device
CN110880325A (en) * 2018-09-05 2020-03-13 华为技术有限公司 Identity recognition method and equipment
CN110971755A (en) * 2019-11-18 2020-04-07 武汉大学 Double-factor identity authentication method based on PIN code and pressure code
CN111402899A (en) * 2020-03-25 2020-07-10 中国工商银行股份有限公司 Cross-channel voiceprint identification method and device
CN111625704A (en) * 2020-05-11 2020-09-04 镇江纵陌阡横信息科技有限公司 Non-personalized recommendation algorithm model based on user intention and data cooperation
CN111899566A (en) * 2020-08-11 2020-11-06 南京畅淼科技有限责任公司 Ship traffic management system based on AIS
CN112289324A (en) * 2020-10-27 2021-01-29 湖南华威金安企业管理有限公司 Voiceprint identity recognition method and device and electronic equipment
CN112802481A (en) * 2021-04-06 2021-05-14 北京远鉴信息技术有限公司 Voiceprint verification method, voiceprint recognition model training method, device and equipment
CN113421575A (en) * 2021-06-30 2021-09-21 平安科技(深圳)有限公司 Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
CN114780787A (en) * 2022-04-01 2022-07-22 杭州半云科技有限公司 Voiceprint retrieval method, identity verification method, identity registration method and device
CN114826709A (en) * 2022-04-15 2022-07-29 马上消费金融股份有限公司 Identity authentication and acoustic environment detection method, system, electronic device and medium

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108091326B (en) * 2018-02-11 2021-08-06 张晓雷 Voiceprint recognition method and system based on linear regression
CN108768654B (en) * 2018-04-09 2020-04-21 平安科技(深圳)有限公司 Identity verification method based on voiceprint recognition, server and storage medium
CN108694952B (en) * 2018-04-09 2020-04-28 平安科技(深圳)有限公司 Electronic device, identity authentication method and storage medium
CN108447489B (en) * 2018-04-17 2020-05-22 清华大学 Continuous voiceprint authentication method and system with feedback
CN108630208B (en) * 2018-05-14 2020-10-27 平安科技(深圳)有限公司 Server, voiceprint-based identity authentication method and storage medium
CN108650266B (en) * 2018-05-14 2020-02-18 平安科技(深圳)有限公司 Server, voiceprint verification method and storage medium
CN108834138B (en) * 2018-05-25 2022-05-24 北京国联视讯信息技术股份有限公司 Network distribution method and system based on voiceprint data
CN109087647B (en) * 2018-08-03 2023-06-13 平安科技(深圳)有限公司 Voiceprint recognition processing method and device, electronic equipment and storage medium
CN109450850B (en) * 2018-09-26 2022-10-11 深圳壹账通智能科技有限公司 Identity authentication method, identity authentication device, computer equipment and storage medium
CN109147797B (en) * 2018-10-18 2024-05-07 平安科技(深圳)有限公司 Customer service method, device, computer equipment and storage medium based on voiceprint recognition
CN110046910B (en) * 2018-12-13 2023-04-14 蚂蚁金服(杭州)网络技术有限公司 Method and equipment for judging validity of transaction performed by customer through electronic payment platform
CN109816508A (en) * 2018-12-14 2019-05-28 深圳壹账通智能科技有限公司 Big-data-based user identity authentication method, device and computer equipment
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Identity verification method, device, equipment and storage medium based on voiceprint recognition
CN109545226B (en) * 2019-01-04 2022-11-22 平安科技(深圳)有限公司 Voice recognition method, device and computer readable storage medium
CN110322888B (en) * 2019-05-21 2023-05-30 平安科技(深圳)有限公司 Credit card unlocking method, apparatus, device and computer readable storage medium
CN110334603A (en) * 2019-06-06 2019-10-15 视联动力信息技术股份有限公司 Authentication system
CN111597531A (en) * 2020-04-07 2020-08-28 北京捷通华声科技股份有限公司 Identity authentication method and device, electronic equipment and readable storage medium
CN111710340A (en) * 2020-06-05 2020-09-25 深圳市卡牛科技有限公司 Method, device, server and storage medium for identifying user identity based on voice
CN111613230A (en) * 2020-06-24 2020-09-01 泰康保险集团股份有限公司 Voiceprint verification method, voiceprint verification device, voiceprint verification equipment and storage medium
CN112669841B (en) * 2020-12-18 2024-07-02 平安科技(深圳)有限公司 Training method and device for generating model of multilingual voice and computer equipment
CN113889120A (en) * 2021-09-28 2022-01-04 北京百度网讯科技有限公司 Voiceprint feature extraction method and device, electronic equipment and storage medium

Family Cites Families (35)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8645137B2 (en) * 2000-03-16 2014-02-04 Apple Inc. Fast, language-independent method for user authentication by voice
CN1170239C (en) * 2002-09-06 2004-10-06 浙江大学 Palm acoustic-print verifying system
TWI234762B (en) * 2003-12-22 2005-06-21 Top Dihital Co Ltd Voiceprint identification system for e-commerce
US7447633B2 (en) * 2004-11-22 2008-11-04 International Business Machines Corporation Method and apparatus for training a text independent speaker recognition system using speech data with text labels
US7536304B2 (en) * 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
CN101064043A (en) * 2006-04-29 2007-10-31 上海优浪信息科技有限公司 Voiceprint access control system and uses thereof
CN102479511A (en) * 2010-11-23 2012-05-30 盛乐信息技术(上海)有限公司 Large-scale voiceprint authentication method and system
TW201301261A (en) * 2011-06-27 2013-01-01 Hon Hai Prec Ind Co Ltd Identity authentication system and method thereof
CN102238190B (en) * 2011-08-01 2013-12-11 安徽科大讯飞信息科技股份有限公司 Identity authentication method and system
CN102509547B (en) * 2011-12-29 2013-06-19 辽宁工业大学 Method and system for voiceprint recognition based on vector quantization
US9042867B2 (en) * 2012-02-24 2015-05-26 Agnitio S.L. System and method for speaker recognition on mobile devices
CN102695112A (en) * 2012-06-09 2012-09-26 九江妙士酷实业有限公司 Automobile player and volume control method thereof
CN102820033B (en) * 2012-08-17 2013-12-04 南京大学 Voiceprint identification method
CN102916815A (en) * 2012-11-07 2013-02-06 华为终端有限公司 Method and device for checking identity of user
CN103220286B (en) * 2013-04-10 2015-02-25 郑方 Identity verification system and identity verification method based on dynamic password voice
CN104427076A (en) * 2013-08-30 2015-03-18 中兴通讯股份有限公司 Recognition method and recognition device for automatic answering of calling system
CN103632504A (en) * 2013-12-17 2014-03-12 上海电机学院 Silence reminder for library
CN104765996B (en) * 2014-01-06 2018-04-27 讯飞智元信息科技有限公司 Voiceprint password authentication method and system
CN104978507B (en) * 2014-04-14 2019-02-01 中国石油化工集团公司 Identity authentication method for an intelligent well-logging evaluation expert system based on voiceprint recognition
CN105100911A (en) * 2014-05-06 2015-11-25 夏普株式会社 Intelligent multimedia system and method
CN103986725A (en) * 2014-05-29 2014-08-13 中国农业银行股份有限公司 Client side, server side and identity authentication system and method
CN104157301A (en) * 2014-07-25 2014-11-19 广州三星通信技术研究有限公司 Method, device and terminal for deleting blank segments of voice information
CN105321293A (en) * 2014-09-18 2016-02-10 广东小天才科技有限公司 Danger detection reminding method and intelligent equipment
CN104485102A (en) * 2014-12-23 2015-04-01 智慧眼(湖南)科技发展有限公司 Voiceprint recognition method and device
CN104751845A (en) * 2015-03-31 2015-07-01 江苏久祥汽车电器集团有限公司 Voice recognition method and system used for intelligent robot
CN104992708B (en) * 2015-05-11 2018-07-24 国家计算机网络与信息安全管理中心 Short-time specific audio detection model generation and detection method
CN105096955B (en) * 2015-09-06 2019-02-01 广东外语外贸大学 Rapid speaker identification method and system based on model growth clustering
CN105611461B (en) * 2016-01-04 2019-12-17 浙江宇视科技有限公司 Noise suppression method, device and system for front-end equipment voice application system
CN105575394A (en) * 2016-01-04 2016-05-11 北京时代瑞朗科技有限公司 Voiceprint identification method based on global change space and deep learning hybrid modeling
CN106971717A (en) * 2016-01-14 2017-07-21 芋头科技(杭州)有限公司 Speech recognition method and device for collaborative processing by a robot and a network server
CN105869645B (en) * 2016-03-25 2019-04-12 腾讯科技(深圳)有限公司 Voice data processing method and device
CN106210323B (en) * 2016-07-13 2019-09-24 Oppo广东移动通信有限公司 Voice playback method and terminal device
CN106169295B (en) * 2016-07-15 2019-03-01 腾讯科技(深圳)有限公司 Identity vector generation method and device
CN106373576B (en) * 2016-09-07 2020-07-21 Tcl科技集团股份有限公司 Speaker confirmation method and system based on VQ and SVM algorithms
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 Method and system for identity authentication based on voiceprint recognition

Cited By (48)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018166187A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Server, identity verification method and system, and a computer-readable storage medium
WO2019019256A1 (en) * 2017-07-25 2019-01-31 平安科技(深圳)有限公司 Electronic apparatus, identity verification method and system, and computer-readable storage medium
US11068571B2 (en) 2017-07-25 2021-07-20 Ping An Technology (Shenzhen) Co., Ltd. Electronic device, method and system of identity verification and computer readable storage medium
WO2019100606A1 (en) * 2017-11-21 2019-05-31 平安科技(深圳)有限公司 Electronic device, voiceprint-based identity verification method and system, and storage medium
CN107993071A (en) * 2017-11-21 2018-05-04 平安科技(深圳)有限公司 Electronic device, voiceprint-based identity verification method and storage medium
CN108172230A (en) * 2018-01-03 2018-06-15 平安科技(深圳)有限公司 Voiceprint registration method based on a voiceprint recognition model, terminal device and storage medium
CN108269575B (en) * 2018-01-12 2021-11-02 平安科技(深圳)有限公司 Voice recognition method for updating voiceprint data, terminal device and storage medium
WO2019136912A1 (en) * 2018-01-12 2019-07-18 平安科技(深圳)有限公司 Electronic device, identity authentication method and system, and storage medium
CN108269575A (en) * 2018-01-12 2018-07-10 平安科技(深圳)有限公司 Voice recognition method for updating voiceprint data, terminal device and storage medium
CN108154371A (en) * 2018-01-12 2018-06-12 平安科技(深圳)有限公司 Electronic device, identity authentication method and storage medium
CN108766444A (en) * 2018-04-09 2018-11-06 平安科技(深圳)有限公司 User identity authentication method, server and storage medium
CN108766444B (en) * 2018-04-09 2020-11-03 平安科技(深圳)有限公司 User identity authentication method, server and storage medium
CN108806695A (en) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Self-updating anti-fraud method, apparatus, computer equipment and storage medium
WO2019200744A1 (en) * 2018-04-17 2019-10-24 平安科技(深圳)有限公司 Self-updated anti-fraud method and apparatus, computer device and storage medium
CN109101801A (en) * 2018-07-12 2018-12-28 北京百度网讯科技有限公司 Method, apparatus, device and computer readable storage medium for identity authentication
US11294995B2 (en) 2018-07-12 2022-04-05 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for identity authentication, and computer readable storage medium
CN109101801B (en) * 2018-07-12 2021-04-27 北京百度网讯科技有限公司 Method, apparatus, device and computer readable storage medium for identity authentication
CN109256138A (en) * 2018-08-13 2019-01-22 平安科技(深圳)有限公司 Identity verification method, terminal device and computer readable storage medium
CN109256138B (en) * 2018-08-13 2023-07-07 平安科技(深圳)有限公司 Identity verification method, terminal device and computer readable storage medium
CN110867189A (en) * 2018-08-28 2020-03-06 北京京东尚科信息技术有限公司 Login method and device
CN110880325B (en) * 2018-09-05 2022-06-28 华为技术有限公司 Identity recognition method and equipment
CN110880325A (en) * 2018-09-05 2020-03-13 华为技术有限公司 Identity recognition method and equipment
CN109377662A (en) * 2018-09-29 2019-02-22 途客易达(天津)网络科技有限公司 Charging pile control method, device and electronic equipment
CN109378002B (en) * 2018-10-11 2024-05-07 平安科技(深圳)有限公司 Voiceprint verification method, voiceprint verification device, computer equipment and storage medium
CN109378002A (en) * 2018-10-11 2019-02-22 平安科技(深圳)有限公司 Voiceprint verification method, apparatus, computer equipment and storage medium
CN109257362A (en) * 2018-10-11 2019-01-22 平安科技(深圳)有限公司 Voiceprint verification method, apparatus, computer equipment and storage medium
CN109524026A (en) * 2018-10-26 2019-03-26 北京网众共创科技有限公司 Method and device for determining prompt tone, storage medium and electronic device
CN109473105A (en) * 2018-10-26 2019-03-15 平安科技(深圳)有限公司 Text-independent voiceprint verification method, apparatus and computer equipment
CN109524026B (en) * 2018-10-26 2022-04-26 北京网众共创科技有限公司 Method and device for determining prompt tone, storage medium and electronic device
CN109493873A (en) * 2018-11-13 2019-03-19 平安科技(深圳)有限公司 Livestock voiceprint recognition method, device, terminal device and computer storage medium
CN109360573A (en) * 2018-11-13 2019-02-19 平安科技(深圳)有限公司 Livestock voiceprint recognition method, device, terminal device and computer storage medium
CN109636630A (en) * 2018-12-07 2019-04-16 泰康保险集团股份有限公司 Method, apparatus, medium and electronic equipment for detecting insurance application behavior
CN110298150A (en) * 2019-05-29 2019-10-01 上海拍拍贷金融信息服务有限公司 Identity verification method and system based on voice recognition
CN110298150B (en) * 2019-05-29 2021-11-26 上海拍拍贷金融信息服务有限公司 Identity verification method and system based on voice recognition
CN110738998A (en) * 2019-09-11 2020-01-31 深圳壹账通智能科技有限公司 Voice-based personal credit evaluation method, device, terminal and storage medium
CN110473569A (en) * 2019-09-11 2019-11-19 苏州思必驰信息科技有限公司 Optimization method and system for detecting speaker spoofing attacks
CN110971755A (en) * 2019-11-18 2020-04-07 武汉大学 Double-factor identity authentication method based on PIN code and pressure code
CN111402899A (en) * 2020-03-25 2020-07-10 中国工商银行股份有限公司 Cross-channel voiceprint identification method and device
CN111402899B (en) * 2020-03-25 2023-10-13 中国工商银行股份有限公司 Cross-channel voiceprint recognition method and device
CN111625704A (en) * 2020-05-11 2020-09-04 镇江纵陌阡横信息科技有限公司 Non-personalized recommendation algorithm model based on user intention and data cooperation
CN111899566A (en) * 2020-08-11 2020-11-06 南京畅淼科技有限责任公司 Ship traffic management system based on AIS
CN112289324A (en) * 2020-10-27 2021-01-29 湖南华威金安企业管理有限公司 Voiceprint identity recognition method and device and electronic equipment
CN112289324B (en) * 2020-10-27 2024-05-10 湖南华威金安企业管理有限公司 Voiceprint identity recognition method and device and electronic equipment
CN112802481A (en) * 2021-04-06 2021-05-14 北京远鉴信息技术有限公司 Voiceprint verification method, voiceprint recognition model training method, device and equipment
CN113421575A (en) * 2021-06-30 2021-09-21 平安科技(深圳)有限公司 Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
CN113421575B (en) * 2021-06-30 2024-02-06 平安科技(深圳)有限公司 Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
CN114780787A (en) * 2022-04-01 2022-07-22 杭州半云科技有限公司 Voiceprint retrieval method, identity verification method, identity registration method and device
CN114826709A (en) * 2022-04-15 2022-07-29 马上消费金融股份有限公司 Identity authentication and acoustic environment detection method, system, electronic device and medium

Also Published As

Publication number Publication date
WO2018166112A1 (en) 2018-09-20
TW201833810A (en) 2018-09-16
WO2018166187A1 (en) 2018-09-20
TWI641965B (en) 2018-11-21
CN107517207A (en) 2017-12-26

Similar Documents

Publication Publication Date Title
CN107068154A (en) Method and system for identity authentication based on voiceprint recognition
CN107527620B (en) Electronic device, identity authentication method and computer readable storage medium
CN110457432B (en) Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium
CN112562691B (en) Voiceprint recognition method, voiceprint recognition device, computer equipment and storage medium
CN102737633B (en) Method and device for recognizing speaker based on tensor subspace analysis
CN109524014A (en) Voiceprint recognition analysis method based on deep convolutional neural networks
CN107993071A (en) Electronic device, voiceprint-based identity verification method and storage medium
CN107610707A (en) Voiceprint recognition method and device
CN111694940B (en) User report generation method and terminal equipment
CN106847292A (en) Voiceprint recognition method and device
CN111933154B (en) Method, equipment and computer readable storage medium for recognizing fake voice
CN108154371A (en) Electronic device, identity authentication method and storage medium
CN105096955A (en) Speaker rapid identification method and system based on growing and clustering algorithm of models
CN110473552A (en) Speech recognition authentication method and system
CN116153337B (en) Synthetic voice tracing evidence obtaining method and device, electronic equipment and storage medium
CN115083422B (en) Voice traceability evidence obtaining method and device, equipment and storage medium
CN109545226A (en) Voice recognition method, device and computer readable storage medium
Zhang et al. Temporal Transformer Networks for Acoustic Scene Classification.
CN113793620B (en) Voice noise reduction method, device and equipment based on scene classification and storage medium
CN108650266A (en) Server, voiceprint verification method and storage medium
CN108630208A (en) Server, voiceprint-based identity authentication method and storage medium
CN117253490A (en) Conformer-based speaker verification method and system
CN110047491A (en) Speaker recognition method and device based on random digit passwords
CN113870887A (en) Single-channel speech enhancement method and device, computer equipment and storage medium
CN117935813B (en) Voiceprint recognition method and voiceprint recognition system

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170818