CN109493873A - Livestock voiceprint recognition method, device, terminal device and computer storage medium - Google Patents
Livestock voiceprint recognition method, device, terminal device and computer storage medium
- Publication number
- CN109493873A (application number CN201811348858.1A)
- Authority
- CN
- China
- Prior art keywords
- voiceprint
- recognition model
- livestock
- identity
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/26—Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Machine Translation (AREA)
Abstract
This application discloses a livestock voiceprint recognition method, apparatus, terminal device and computer-readable storage medium, relating to the field of intelligent decision making. The method comprises: obtaining voice information to be recognized of livestock whose identity is to be confirmed; extracting voiceprint features to be recognized from the voice information to be recognized by means of a deep neural network; and inputting the voiceprint features to be recognized into a voiceprint recognition model to obtain the identity information of the livestock whose identity is to be confirmed, the voiceprint recognition model being used to obtain identity information from voiceprint features. With the scheme in the embodiments of this application, the voiceprint features of the voice information to be recognized can be extracted based on a deep neural network, and the identity information of the livestock whose identity is to be confirmed is then recognized by the voiceprint recognition model. Because the voiceprint features are extracted by the deep neural network, redundant information in the voice information can be removed and the extraction accuracy of the voiceprint features is improved, thereby improving the accuracy of livestock identity recognition.
Description
Technical Field
The application relates to the technical field of intelligent decision making, and in particular to a livestock voiceprint recognition method, device, terminal device and computer storage medium.
Background
To facilitate livestock management, animals need to be managed according to their identities. In the prior art, the facial features of an animal whose identity is to be recognized are generally compared with the facial features of animals stored in a database, the database storing the facial features of animals with known identities; if a match is found in the database, the identity information of the animal to be recognized can be determined. However, as the number of animals with unknown identities grows, their identity information cannot be identified from the known-identity animals in the database, so the identification result of the livestock identity is inaccurate. In addition, the facial features of animals of the same type are highly similar, which also makes the identification result inaccurate.
Disclosure of Invention
The purpose of this application is to solve at least one of the above technical defects and to improve the accuracy of the livestock identity recognition result. The technical solution adopted by the application is as follows:
in a first aspect, the present application provides a livestock voiceprint recognition method, comprising:
acquiring voice information to be recognized of livestock to be identified;
extracting voiceprint features to be recognized of the voice information to be recognized according to the deep neural network;
and inputting the voiceprint characteristics to be recognized into a voiceprint recognition model to obtain the identity information of the livestock with the identity to be confirmed, wherein the voiceprint recognition model is used for obtaining the identity information according to the voiceprint characteristics.
In an optional embodiment of the application, the voiceprint recognition model is a model obtained by adaptively training an original voiceprint recognition model based on the voiceprint features of livestock whose identities have been newly confirmed, and the original voiceprint recognition model is a model obtained by pre-training based on the voiceprint features of original voice information and the corresponding identity information.
In an alternative embodiment of the present application, the method further comprises:
and adaptively training the original voiceprint recognition model based on the voiceprint features of the newly identified livestock to obtain the voiceprint recognition model.
In an optional embodiment of the application, adaptively training the original voiceprint recognition model based on the voiceprint features of the newly identified livestock to obtain the voiceprint recognition model includes:
adding voiceprint characteristics and identity information of the newly-identified livestock to a livestock voice information base, wherein the livestock voice information base is used for storing the corresponding relation between the voiceprint characteristics and the identity information, and the corresponding relation between the voiceprint characteristics and the identity information comprises the corresponding relation between the voiceprint characteristics of the original voice information and the corresponding identity information;
and carrying out self-adaptive training on the original voiceprint recognition model according to all voiceprint characteristics in the livestock voice information base and corresponding identity information to obtain the voiceprint recognition model.
In an optional embodiment of the present application, the original voiceprint recognition model includes a voiceprint vector extractor and an identity recognition model, the voiceprint vector extractor is configured to obtain a voiceprint feature vector according to a voiceprint feature of the voice information, and the identity recognition model is configured to obtain identity information according to the voiceprint feature vector;
according to all voiceprint characteristics in the livestock voice information base and corresponding identity information, carrying out self-adaptive training on an original voiceprint recognition model to obtain the voiceprint recognition model, and the method comprises the following steps:
determining the voiceprint feature vectors of all the voiceprint features according to the voiceprint vector extractor;
and carrying out self-adaptive training on the identity recognition model according to all the voiceprint feature vectors and the corresponding identity information to obtain an updated identity recognition model, wherein the voiceprint recognition model comprises a voiceprint vector extractor and the updated identity recognition model.
In an optional embodiment of the application, inputting the voiceprint features to be recognized into the voiceprint recognition model to obtain the identity information of the livestock whose identity is to be confirmed includes:
determining a voiceprint feature vector to be identified of the voiceprint feature to be identified according to the voiceprint vector extractor;
and determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model.
In an optional embodiment of the present application, determining, according to the updated identity recognition model, identity information corresponding to the voiceprint feature vector to be recognized includes:
determining, according to the updated identity recognition model, the probability that the voiceprint feature vector to be recognized belongs to the voiceprint feature vector corresponding to each piece of voice information in the livestock voice information base;
and determining the identity information corresponding to the voice information with the probability value larger than the preset threshold value as the identity information corresponding to the voice information to be recognized.
In a second aspect, the present application provides a livestock voiceprint recognition apparatus comprising:
the voice information acquisition module is used for acquiring voice information to be recognized of the livestock to be identified;
the voiceprint feature extraction module is used for extracting voiceprint features to be recognized of the voice information to be recognized according to the deep neural network;
and the voiceprint recognition module is used for inputting the voiceprint characteristics to be recognized into the voiceprint recognition model to obtain the identity information of the livestock with the identity to be confirmed, and the voiceprint recognition model is used for obtaining the identity information according to the voiceprint characteristics.
In an optional embodiment of the application, the voiceprint recognition model is a model obtained by adaptively training an original voiceprint recognition model based on the voiceprint features of livestock whose identities have been newly confirmed, and the original voiceprint recognition model is a model obtained by pre-training based on the voiceprint features of original voice information and the corresponding identity information.
In an alternative embodiment of the present application, the apparatus further comprises:
and the model training module is used for carrying out self-adaptive training on the original voiceprint recognition model based on the voiceprint characteristics of the livestock with the newly confirmed identity to obtain the voiceprint recognition model.
In an optional embodiment of the application, when adaptively training the original voiceprint recognition model based on the voiceprint features of the newly identified livestock to obtain the voiceprint recognition model, the model training module is specifically configured to:
adding voiceprint characteristics and identity information of the newly-identified livestock to a livestock voice information base, wherein the livestock voice information base is used for storing the corresponding relation between the voiceprint characteristics and the identity information, and the corresponding relation between the voiceprint characteristics and the identity information comprises the corresponding relation between the voiceprint characteristics of the original voice information and the corresponding identity information;
and carrying out self-adaptive training on the original voiceprint recognition model according to all voiceprint characteristics in the livestock voice information base and corresponding identity information to obtain the voiceprint recognition model.
In an optional embodiment of the present application, the original voiceprint recognition model includes a voiceprint vector extractor and an identity recognition model, the voiceprint vector extractor is configured to obtain a voiceprint feature vector according to a voiceprint feature of the voice information, and the identity recognition model is configured to obtain identity information according to the voiceprint feature vector;
the model training module carries out self-adaptation training to original voiceprint recognition model according to all voiceprint features in the livestock voice information base and corresponding identity information, and when obtaining the voiceprint recognition model, the model training module is specifically used for:
determining the voiceprint feature vectors of all the voiceprint features according to the voiceprint vector extractor;
and carrying out self-adaptive training on the identity recognition model according to all the voiceprint feature vectors and the corresponding identity information to obtain an updated identity recognition model, wherein the voiceprint recognition model comprises a voiceprint vector extractor and the updated identity recognition model.
In an optional embodiment of the application, when inputting the voiceprint features to be recognized into the voiceprint recognition model to obtain the identity information of the livestock whose identity is to be confirmed, the voiceprint recognition module is specifically configured to:
determining a voiceprint feature vector to be identified of the voiceprint feature to be identified according to the voiceprint vector extractor;
and determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model.
In an optional embodiment of the present application, when determining the identity information corresponding to the voiceprint feature vector of the speech information to be recognized according to the updated identity recognition model, the voiceprint recognition module is specifically configured to:
determining, according to the updated identity recognition model, the probability that the voiceprint feature vector to be recognized belongs to the voiceprint feature vector corresponding to each piece of voice information in the livestock voice information base;
and determining the identity information corresponding to the voice information with the probability value larger than the preset threshold value as the identity information corresponding to the voice information to be recognized.
In a third aspect, the present application provides a terminal device, including: a processor, a memory, and a bus; a bus for connecting the processor and the memory; a memory for storing operating instructions; a processor configured to execute the method as shown in any embodiment of the first aspect of the present application by calling an operation instruction.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon at least one instruction, at least one program, set of codes, or set of instructions, which is loaded and executed by a processor to implement a method as shown in any one of the embodiments of the first aspect of the application.
The technical scheme provided by the embodiment of the application has the following beneficial effects:
the livestock voiceprint recognition method, the livestock voiceprint recognition device, the terminal equipment and the computer storage medium can extract voiceprint features of voice information to be recognized based on the deep neural network, then identify the identity information of the livestock to be identified through the voiceprint recognition model, and due to the fact that the voiceprint features of the voice information to be recognized are extracted through the deep neural network, redundant information in the voice information can be removed, extraction accuracy of the voiceprint features is improved, and therefore accuracy of identification of the livestock is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic flow chart of a livestock voiceprint recognition method provided by an embodiment of the application;
fig. 2 is a schematic structural diagram of a livestock voiceprint recognition device provided by the embodiment of the application;
fig. 3 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary only for the purpose of explaining the present application and are not to be construed as limiting the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Fig. 1 shows a flow chart of a livestock voiceprint recognition method provided by an embodiment of the application, and as shown in fig. 1, the method may include:
step S110, acquiring the voice information to be recognized of the livestock to be identified.
The voice information to be recognized refers to the voice information of the livestock whose identity is to be confirmed. It can be single-channel or multi-channel voice information, and it can be collected through any device with a voice signal collection function.
And step S120, extracting the voiceprint features to be recognized of the voice information to be recognized according to the deep neural network.
The deep neural network can be obtained by training according to a large amount of livestock voice information and corresponding voiceprint features in advance, and the deep neural network is used for obtaining the voiceprint features according to the voice information.
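The patent does not specify the network architecture or the acoustic front end, so the following is only a minimal sketch, assuming log-spectral frame features as input and a small feed-forward network whose pooled hidden activations serve as the fixed-length voiceprint feature; the `frame_features` helper, the layer sizes and the random weights are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160, n_bins=40):
    """Hypothetical front end: per-frame log-magnitude spectrum bins."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    spec = np.abs(np.fft.rfft(np.stack(frames), axis=1))[:, :n_bins]
    return np.log(spec + 1e-8)                  # (num_frames, n_bins)

def dnn_voiceprint(frames, weights, biases):
    """Pass frame features through a small feed-forward network and mean-pool
    the last hidden layer into one fixed-length voiceprint feature."""
    h = frames
    for W, b in zip(weights, biases):
        h = np.maximum(0.0, h @ W + b)          # ReLU hidden layers
    return h.mean(axis=0)                       # pooled voiceprint feature

# Toy usage with random weights standing in for a trained network.
rng = np.random.default_rng(0)
sizes = [40, 128, 64]
weights = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]
feature = dnn_voiceprint(frame_features(rng.standard_normal(16000)), weights, biases)
```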
Step S130, inputting the voiceprint characteristics to be recognized into a voiceprint recognition model to obtain the identity information of the livestock to be identified, wherein the voiceprint recognition model is used for obtaining the identity information according to the voiceprint characteristics.
The voiceprint recognition model can be a preconfigured model used for obtaining identity information according to voiceprint characteristics, and can also be a model which is continuously updated.
According to the scheme in the embodiment of the application, the voiceprint features of the to-be-recognized voice information can be extracted based on the deep neural network, then the identity information of the to-be-recognized identity livestock is recognized through the voiceprint recognition model, due to the fact that the voiceprint features of the to-be-recognized voice information are extracted through the deep neural network, redundant information in the voice information can be removed, extraction accuracy of the voiceprint features is improved, and therefore the accuracy of livestock identity recognition is improved.
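Putting steps S110-S130 together, a top-level identification flow could be organized as in the sketch below; `extract_voiceprint` and `voiceprint_model.identify` are assumed interfaces standing in for the deep neural network and the voiceprint recognition model described above.

```python
def recognize_livestock(audio, extract_voiceprint, voiceprint_model):
    """Sketch of S110-S130: audio of the animal in, identity information out."""
    feature = extract_voiceprint(audio)              # S120: DNN voiceprint feature
    identity = voiceprint_model.identify(feature)    # S130: voiceprint recognition model
    return identity                                  # e.g. None when no identity matches
```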
In an optional embodiment of the application, the voiceprint recognition model is a model obtained by adaptively training an original voiceprint recognition model based on the voiceprint features of livestock whose identities have been newly confirmed, and the original voiceprint recognition model is a model obtained by pre-training based on the voiceprint features of original voice information and the corresponding identity information.
The original voiceprint recognition model can be a model trained on the voiceprint features of the original voice information of a large number of livestock and the corresponding identity information. The voiceprint recognition model is obtained by adaptively training the original voiceprint recognition model according to the voiceprint features of the newly identified livestock, i.e. the original voiceprint recognition model is continuously updated as voiceprint features of newly identified livestock are continuously added. In the voiceprint features of the original voice information and the corresponding identity information, the identity information may be coded information identifying the identity of the livestock, for example an identity code, with each animal corresponding to a unique identity code. Because the voiceprint recognition model is continuously adaptively trained and updated from the original voiceprint recognition model based on the voiceprint features of newly identified livestock, the accuracy of livestock identity recognition is improved.
In an optional embodiment of the present application, the method may further include:
and carrying out self-adaptive training on the original voiceprint recognition model based on the voiceprint characteristics of the livestock with the newly confirmed identity to obtain the voiceprint recognition model.
In an optional embodiment of the present application, the original voiceprint recognition model is adaptively trained based on the voiceprint characteristics of the newly identified livestock to obtain the voiceprint recognition model, which may include:
adding voiceprint characteristics and identity information of the newly-identified livestock to a livestock voice information base, wherein the livestock voice information base is used for storing the corresponding relation between the voiceprint characteristics and the identity information, and the corresponding relation between the voiceprint characteristics and the identity information comprises the corresponding relation between original voice information and corresponding identity information;
and carrying out self-adaptive training on the original voiceprint recognition model according to all voiceprint characteristics and corresponding identity information in the livestock voice information base to obtain the voiceprint recognition model.
The livestock voice information base can store the voiceprint features of the voice information corresponding to different livestock identities. All the voiceprint features include the voiceprint features of the original voice information with the corresponding identity information, and the voiceprint features of the newly identified livestock with the corresponding identity information. As the voiceprint features of newly identified livestock are continuously added to the livestock voice information base, the voiceprint features representing the voice information of known-identity livestock in the base continuously increase, so the original voiceprint recognition model can be continuously updated based on all the voiceprint features and corresponding identity information in the livestock voice information base, which improves the recognition accuracy of the voiceprint recognition model.
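A minimal sketch of the livestock voice information base and the adaptive update it enables is shown below; the list-of-pairs layout and the `retrain` callback are assumptions made for illustration, not a data structure defined by the patent.

```python
class LivestockVoiceBase:
    """Stores the correspondence between voiceprint features and identity codes."""

    def __init__(self):
        self.entries = []                     # list of (voiceprint_feature, identity_code)

    def add(self, voiceprint_feature, identity_code):
        """Add a newly identified animal's voiceprint feature and identity code."""
        self.entries.append((voiceprint_feature, identity_code))

    def all_features(self):
        feats, ids = zip(*self.entries)
        return list(feats), list(ids)

def adaptive_update(base, new_feature, new_identity, retrain):
    """Add the new animal to the base, then retrain on the whole base."""
    base.add(new_feature, new_identity)
    feats, ids = base.all_features()
    return retrain(feats, ids)                # returns the updated voiceprint recognition model
```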
In an optional embodiment of the present application, the original voiceprint recognition model may include a voiceprint Vector extractor and an identity recognition model, the voiceprint Vector extractor is configured to obtain a voiceprint feature Vector (i-Vector) according to a voiceprint feature of the voice information, and the identity recognition model is configured to obtain the identity information according to the voiceprint feature Vector.
Because different voiceprint feature vectors can reflect the voiceprint features of different voice information, and different voiceprint features can reflect different identity information, the identity information corresponding to a piece of voice information can be determined from its voiceprint feature vector; through the voiceprint vector extractor and the identity recognition model included in the original voiceprint recognition model, the identity information corresponding to the voice information of the livestock can be distinguished.
In an optional embodiment of the application, adaptively training the original voiceprint recognition model according to all the voiceprint features in the livestock voice information base and the corresponding identity information to obtain the voiceprint recognition model may include:
determining the voiceprint feature vectors of all the voiceprint features according to the voiceprint vector extractor;
and carrying out self-adaptive training on the identity recognition model according to all the voiceprint feature vectors and the corresponding identity information to obtain an updated identity recognition model, wherein the voiceprint recognition model comprises a voiceprint vector extractor and the updated identity recognition model.
Because the voiceprint feature vectors reflect the voiceprint features of different voice information, and different voiceprint features reflect different identity information, adaptively training the identity recognition model with all the voiceprint feature vectors and the corresponding identity information allows the updated identity recognition model to determine the identity information of livestock from their voiceprint feature vectors more accurately.
In an optional embodiment of the application, the identity recognition model may be a model obtained by training a PLDA matrix based on the voiceprint feature vectors of livestock with different identities. Because the PLDA (Probabilistic Linear Discriminant Analysis) matrix is a covariance matrix, which can represent the covariance between the voice information of one animal and the voice information of other animals, the difference between the multi-channel voice information of one animal and the voice information of other animals can be embodied on the basis of this covariance matrix. The PLDA covariance matrix helps to better extract the information of the animal's voice itself contained in the i-Vector and to eliminate, as far as possible, the influence caused by channel differences. Therefore, the identity recognition model obtained by training the PLDA matrix can improve the recognition accuracy when identifying the voiceprint feature vectors of livestock voice information.
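The patent only states that the identity recognition model is trained as a PLDA matrix. As one concrete illustration, a two-covariance PLDA verification score between two i-vectors can be computed as below; the between-animal covariance `B` and within-animal covariance `W` are assumed to have been estimated elsewhere, and the mean-subtracted i-vectors are an assumption of this sketch.

```python
import numpy as np
from scipy.stats import multivariate_normal

def plda_score(x1, x2, B, W):
    """Log-likelihood ratio: same animal vs. different animals, under a
    two-covariance PLDA model with between-class B and within-class W.
    Assumes x1 and x2 are mean-subtracted i-vectors."""
    d = len(x1)
    joint = np.concatenate([x1, x2])
    # Same animal: the shared latent identity correlates the two i-vectors.
    cov_same = np.block([[B + W, B], [B, B + W]])
    # Different animals: the two i-vectors are independent.
    cov_diff = np.block([[B + W, np.zeros((d, d))], [np.zeros((d, d)), B + W]])
    return (multivariate_normal.logpdf(joint, mean=np.zeros(2 * d), cov=cov_same)
            - multivariate_normal.logpdf(joint, mean=np.zeros(2 * d), cov=cov_diff))
```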
In an optional embodiment of the application, inputting the voiceprint features to be recognized into the voiceprint recognition model to obtain the identity information of the livestock whose identity is to be confirmed may include:
determining a voiceprint feature vector to be identified of the voiceprint feature to be identified according to the voiceprint vector extractor;
and determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model.
Because different voiceprint feature vectors can reflect the voiceprint features of different voice information, and different voiceprint features can reflect different identity information, the identity information corresponding to the voice information to be recognized can be determined from its voiceprint feature vector; identifying the livestock voice information through the updated identity recognition model, based on the voiceprint feature vector of the voiceprint features of the voice information to be recognized, can improve the recognition accuracy.
In an optional embodiment of the application, because the correspondence between all the voiceprint features and the identity information is stored in the livestock voice information base, whether the livestock whose identity is to be confirmed is recorded in the base can be judged from the voiceprint features of its voice information to be recognized, i.e. it can be judged whether the voiceprint features of the voice information to be recognized have corresponding identity information in the livestock voice information base.
In an optional embodiment of the present application, determining, according to the updated identity recognition model, identity information corresponding to a voiceprint feature vector of the speech information to be recognized may include:
determining, according to the updated identity recognition model, the probability that the voiceprint feature vector to be recognized belongs to the voiceprint feature vector corresponding to each piece of voice information in the livestock voice information base;
and determining the identity information corresponding to the voice information with the probability value larger than the preset threshold value as the identity information corresponding to the voice information to be recognized.
The livestock voice information base contains all the voiceprint feature vectors and the corresponding identity information. If the voice information of a certain animal and the corresponding identity information are in the base, then, based on the voiceprint feature vector of the voice information to be recognized of that animal, the probability that this vector belongs to the voiceprint feature vector corresponding to each piece of voice information is calculated, and the identity information corresponding to the voice information whose probability value is larger than the preset threshold is determined as the identity information corresponding to the voice information to be recognized. In practical application, if there are multiple voiceprint feature vectors whose probability values exceed the preset threshold, the identity information corresponding to the voiceprint feature vector with the highest probability value is determined as the identity information corresponding to the voice information to be recognized.
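A sketch of this decision rule: score the voiceprint feature vector to be recognized against every enrolled vector, keep only scores above the preset threshold, and return the identity with the highest score. The `score` callable is an assumption (for example, the PLDA score sketched earlier).

```python
def identify(query_vec, enrolled, score, threshold):
    """enrolled: iterable of (identity_code, voiceprint_feature_vector) pairs.
    Returns the best-matching identity above the threshold, or None if no match."""
    best_id, best_score = None, threshold
    for identity_code, vec in enrolled:
        s = score(query_vec, vec)
        if s > best_score:                    # keep only scores above the preset threshold
            best_id, best_score = identity_code, s
    return best_id                            # None means the animal is not in the base
```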
In an optional embodiment of the present application, the original voiceprint recognition model is a model obtained by training a Gaussian mixture model and a PLDA matrix with the expectation-maximization (EM) method based on the original voice information and the corresponding identity information. The specific training process of the model is as follows:
a1, determining a training sample, wherein the training sample comprises the voiceprint features of the voice information of each livestock and corresponding identity information, the D-dimensional voiceprint features of the voice information of each livestock can be extracted by using a deep neural network, and D is an integer not less than 2;
and A2, training the training sample through a Gaussian mixture model until convergence to obtain an original voiceprint recognition model.
In an optional embodiment of the present application, in a2, training the training sample through a gaussian mixture model until convergence to obtain an original voiceprint recognition model, which may include:
1. The likelihood probability corresponding to the D-dimensional voiceprint features in the training sample is expressed by k Gaussian components in a Gaussian mixture model:

p(x) = Σ_k w_k · p(x|k)    (1)

where π_k denotes the distribution probability density of each Gaussian model (i.e. p(x|k)), p(x) is the probability of the training sample computed by the Gaussian mixture model, w_k is the weight of the k-th Gaussian model, each Gaussian component in the Gaussian mixture model corresponds to one Gaussian model, p(x|k) is the probability of the training sample computed by the k-th Gaussian model, and k is the number of Gaussian models.
Then, the probability distribution of the i-th Gaussian component is:

p(x|i) = N(x | μ_i, Σ_i)    (2)

where i is an integer greater than 0 and not larger than k, μ_i is the mean of the i-th Gaussian model, and Σ_i is the covariance of the i-th Gaussian model.
The parameters of the ith gaussian model in the gaussian mixture model can be expressed as:
{w_i, μ_i, Σ_i}    (3)

where w_i is the weight of the i-th Gaussian model, μ_i is the mean of the i-th Gaussian model, and Σ_i is the covariance of the i-th Gaussian model.
2. The ith gaussian model is selected among the k gaussian models.
3. Obtaining a parameter sample X by using the i-th Gaussian model, i.e. {w_i, μ_i, Σ_i}.
For convenience of calculation, let θ = {w_1, …, w_k, μ_1, …, μ_k, Σ_1, …, Σ_k} denote the parameter set of the k-component Gaussian mixture distribution, i.e. the set formed by the parameters of the k Gaussian models.
4. Calculating the log-likelihood function of the parameter sample X from the parameter sample X and the parameter set θ, and obtaining a locally optimal numerical solution of this function with the EM (expectation-maximization) algorithm;
the parameter sample X is an independent same-distribution sample set obeying Gaussian mixture distribution, and the formula of the maximum log-likelihood function of the parameter sample X is as follows:
wherein,N(xi|μk,Σk)=pi(x),n=k。
Since the summation inside the ln function in the above formula has no closed-form solution, the objective function can be estimated with the unsupervised EM algorithm under the maximum-likelihood criterion, i.e. the log-likelihood function is maximized over the parameters, and a locally optimal numerical solution is obtained with the EM algorithm.
In the solving process, the parameter models iteratively updated at each step are:

w_i = (1/n) Σ_{j=1..n} p(i | x_j, θ)    (5)
μ_i = Σ_{j=1..n} p(i | x_j, θ) · x_j / Σ_{j=1..n} p(i | x_j, θ)    (6)
Σ_i = Σ_{j=1..n} p(i | x_j, θ) · (x_j − μ_i)(x_j − μ_i)^T / Σ_{j=1..n} p(i | x_j, θ)    (7)

where w_i is the weight of the i-th Gaussian model, μ_i is the mean of the i-th Gaussian model, and Σ_i is the covariance of the i-th Gaussian model;

p(i | x_j, θ) is the posterior probability of the i-th Gaussian component, which is calculated as:

p(i | x_j, θ) = w_i · p_i(x_j | θ_i) / Σ_{l=1..k} w_l · p_l(x_j | θ_l)    (8)

where w_i is the weight of the i-th Gaussian component, i.e. the weight of the i-th Gaussian model, p_i(x_j | θ_i) is the probability of the i-th Gaussian component, and k is the number of Gaussian components;

Based on the three parameter models w_i, μ_i and Σ_i above, and in combination with the log-likelihood formula:

ln p(X|θ) = Σ_{j=1..n} ln( Σ_{i=1..k} w_i · N(x_j | μ_i, Σ_i) )    (9)

the iteration continues until the maximum log-likelihood value of the parameter sample X no longer changes, and the original voiceprint recognition model is obtained.
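The re-estimation formulas above are the standard EM updates for a Gaussian mixture model. A minimal numpy sketch is given below; using diagonal covariances is a simplifying assumption of the sketch, not a restriction stated in the patent.

```python
import numpy as np

def em_diag_gmm(X, k, n_iter=50, seed=0):
    """Minimal EM for a diagonal-covariance GMM (sketch of formulas (5)-(8))."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.full(k, 1.0 / k)                       # mixture weights w_i
    mu = X[rng.choice(n, k, replace=False)]       # means mu_i
    var = np.tile(X.var(axis=0), (k, 1))          # diagonal covariances Sigma_i
    for _ in range(n_iter):
        # E-step: posterior p(i | x_j, theta) for every frame and component.
        logp = np.stack([
            -0.5 * (((X - mu[i]) ** 2 / var[i]).sum(1)
                    + np.log(2 * np.pi * var[i]).sum())
            for i in range(k)], axis=1) + np.log(w)
        logp -= logp.max(axis=1, keepdims=True)
        post = np.exp(logp)
        post /= post.sum(axis=1, keepdims=True)
        # M-step: update w_i, mu_i, Sigma_i from the posteriors.
        nk = post.sum(axis=0) + 1e-10
        w = nk / n
        mu = (post.T @ X) / nk[:, None]
        var = (post.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-6
    return w, mu, var
```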
After the training of the original voiceprint recognition model is completed, a trained voiceprint vector extractor and an identity recognition model are obtained based on the obtained weight vector, constant vector, covariance matrix, a matrix of which the mean value is multiplied by covariance and the like.
The weight vector refers to the vector formed by the weights w_i of the Gaussian models, the constant vector refers to the vector formed by the constant terms after the Gaussian mixture model training converges, the covariance matrix refers to the matrix formed by the covariances Σ_i of the Gaussian models, and the mean-times-covariance matrix refers to the matrix obtained by multiplying the covariance Σ_i of a Gaussian model by its mean μ_i.
It can be understood that the voiceprint recognition model can be obtained in the same way as the original voiceprint recognition model is trained above, by adaptively training the original voiceprint recognition model based on the voiceprint features of the newly identified livestock.
In an optional embodiment of the present application, after obtaining the voiceprint feature of the to-be-recognized voice information, the voiceprint feature vector corresponding to the voiceprint feature of the to-be-recognized voice information may be determined based on the voiceprint vector extractor, and the determining may include:
selecting a target Gaussian model in the Gaussian mixture model according to the voiceprint vector extractor;
calculating the posterior probability of the voice information to be recognized according to the target Gaussian model;
determining a first order coefficient and a second order coefficient, and a first order term and a second order term according to the posterior probability;
and calculating to obtain the vocal print characteristic vector according to the first-order coefficient and the second-order coefficient, and the first-order term and the second-order term.
In an optional embodiment of the present application, selecting the target gaussian model in the gaussian mixture model may include:
calculating likelihood logarithm values of each frame of voice signal in the voice information to be recognized in k Gaussian models through parameters in the Gaussian mixture model to obtain k likelihood logarithm values;
forming a likelihood logarithm value matrix from the k likelihood logarithm values, and performing parallel sequencing on each column in the likelihood logarithm value matrix to obtain a sequencing result of the likelihood logarithm values;
selecting Gaussian models corresponding to the first N likelihood logarithm values as target Gaussian models, wherein N is a preset integer value;
the target Gaussian models correspond to the likelihood logarithm matrix formed by the first N likelihood logarithm values of each frame of voice signal among the k Gaussian models of the Gaussian mixture model.
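A sketch of the top-N Gaussian selection step; `loglikes` is assumed to be the frames-by-components likelihood-logarithm matrix built in the previous step.

```python
import numpy as np

def select_top_gaussians(loglikes, n_top):
    """For each frame (row), keep the indices of the N highest log-likelihood components."""
    order = np.argsort(loglikes, axis=1)[:, ::-1]   # per-frame sort, descending
    return order[:, :n_top]                         # (num_frames, n_top) component indices
```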
In an optional embodiment of the present application, calculating a posterior probability of the speech information to be recognized according to the target gaussian model may include:
performing an X·X^T calculation on each frame of voice signal X in the voice information to be recognized to obtain a symmetric matrix, simplifying the symmetric matrix into a lower triangular matrix, and arranging the elements of the lower triangular matrix into one row in element order to obtain an ordered matrix;
multiplying the sorted matrix by N to form a vector, wherein N is a positive integer not less than 2;
combining vectors corresponding to all frames of voice information in the voice information to be recognized into a data matrix, and simplifying a covariance matrix in a Gaussian mixture model into a lower triangular matrix to be changed into a matrix similar to the data matrix;
calculating a likelihood logarithm value of each frame of voice signal under the target Gaussian model through a mean matrix and a covariance matrix in the Gaussian mixture model, wherein the mean matrix is a matrix formed by means in the Gaussian mixture model;
performing Softmax regression calculation on the likelihood logarithm value to obtain a likelihood logarithm value after regression;
and normalizing the regressed likelihood logarithm value to obtain the posterior probability distribution of each frame of voice signal in a Gaussian mixture model, and forming a probability matrix by the probability distribution vector of each frame of voice signal, wherein the probability matrix is the posterior probability of the voice information to be recognized.
The likelihood logarithm value calculation formula is as follows:

loglikes_i = C_i − (1/2) · (X_i − E_i)^T · Cov_i^{-1} · (X_i − E_i)    (10)

where loglikes_i is the i-th row vector of the log-likelihood matrix formed from the likelihood probability values, C_i is the constant term of the i-th Gaussian model, E_i is the mean matrix of the i-th Gaussian model, Cov_i is the covariance matrix of the i-th Gaussian model, and X_i is the i-th frame of the voice information to be recognized.
The formula for performing Softmax regression calculation on the likelihood logarithm value is as follows:
X_i = Exp(X_i − max(X)) / Σ Exp(X_i − max(X))    (11)

where X_i is the i-th value in a row of the log-likelihood matrix and max(X) is the maximum value in that row vector.
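Formula (11) together with the normalization step amounts to a numerically stable row-wise softmax; a small sketch of the regression-and-normalization of the likelihood logarithm values is:

```python
import numpy as np

def frame_posteriors(loglikes):
    """Softmax each row of the log-likelihood matrix into a posterior distribution
    over Gaussian components, giving the probability matrix of formula (11)."""
    shifted = loglikes - loglikes.max(axis=1, keepdims=True)  # subtract max(X) per row
    p = np.exp(shifted)
    return p / p.sum(axis=1, keepdims=True)
```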
In an alternative embodiment of the present application, determining the first order coefficient and the second order coefficient according to the posterior probability may include:
summing the probability matrix columns to determine a first order coefficient;
and transposing the probability matrix and multiplying the probability matrix by a data matrix to determine a second-order coefficient, wherein the data matrix is a matrix formed based on constant vectors in a Gaussian mixture model.
The method includes summing probability matrix columns to determine a first-order coefficient, and specifically includes:
the first-order coefficients are calculated by the following formula:

Gamma_i = Σ_{j=1..n} loglikes_{ji}    (12)

where Gamma_i is the i-th element of the first-order coefficient vector, loglikes_{ji} is the element in the j-th row and i-th column of the probability matrix, and n is the number of rows (frames) of the probability matrix.
The transposing of the probability matrix and the multiplication of the probability matrix by the data matrix to determine a second-order coefficient specifically include:
the second order coefficient is calculated by the following formula:
X = loglikes^T · feats    (13)

where X is the second-order coefficient matrix, loglikes is the probability matrix, and feats is the data matrix.
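Formulas (12) and (13) accumulate the probability matrix against the data matrix; a direct numpy sketch, using the patent's naming of the results as first- and second-order coefficients, is:

```python
import numpy as np

def accumulate_coefficients(loglikes, feats):
    """loglikes: (num_frames, k) probability matrix; feats: (num_frames, d) data matrix.
    Returns the first-order coefficient vector Gamma (column sums, formula (12))
    and the second-order coefficient matrix X = loglikes^T * feats (formula (13))."""
    gamma = loglikes.sum(axis=0)        # Gamma_i: sum of column i over all frames
    second = loglikes.T @ feats         # (k, d) second-order coefficient matrix
    return gamma, second
```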
In an optional embodiment of the present application, the voiceprint feature vector is obtained by calculation according to a first order coefficient, a second order coefficient, a first order term and a second order term, and a calculation formula is as follows:
ivector = quadratic^{-1} · linear    (14)
The calculation formula of linear is:

linear = Σ_{i=1..n} M_i^T · Σ_i^{-1} · X_i    (15)

where M_i is the mean matrix of the i-th Gaussian model in the Gaussian mixture model, Σ_i is the covariance matrix of the i-th Gaussian model, X_i is the i-th row vector of the second-order coefficient matrix X, and n is the number of columns of the probability matrix.

The calculation formula of quadratic is:

quadratic = I + Σ_{i=1..n} m_i · M_i^T · Σ_i^{-1} · M_i    (16)

where m is the first-order coefficient vector Gamma.
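Formula (14) computes the i-vector as quadratic⁻¹ · linear. Because the original formula images for (15) and (16) are not reproduced here, the sketch below follows the standard total-variability point estimate and assumes that M_i plays the role of the per-component factor-loading block and that the second-order statistics have already been centered; the correspondence to the patent's exact symbols is therefore an assumption.

```python
import numpy as np

def ivector_point_estimate(gamma, second, M, Sigma_inv, r):
    """gamma: (k,) first-order coefficient vector; second: (k, d) second-order
    coefficient matrix (assumed mean-centered); M: (k, d, r) assumed per-component
    loading blocks; Sigma_inv: (k, d, d) inverse covariances; r: i-vector dimension."""
    quadratic = np.eye(r)
    linear = np.zeros(r)
    for i in range(len(gamma)):
        quadratic += gamma[i] * M[i].T @ Sigma_inv[i] @ M[i]   # formula (16), assumed form
        linear += M[i].T @ Sigma_inv[i] @ second[i]            # formula (15), assumed form
    return np.linalg.solve(quadratic, linear)                  # ivector = quadratic^{-1} * linear
```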
Based on the same principle as the method shown in fig. 1, an embodiment of the present application further provides a livestock voiceprint recognition apparatus 20. As shown in fig. 2, the livestock voiceprint recognition apparatus 20 may include: a voice information acquisition module 210, a voiceprint feature extraction module 220, and a voiceprint recognition module 230, wherein,
the voice information acquisition module 210 is used for acquiring the voice information to be recognized of the livestock to be identified;
the voiceprint feature extraction module 220 is configured to extract to-be-identified voiceprint features of to-be-identified voice information according to the deep neural network;
the voiceprint recognition module 230 is configured to input the voiceprint features to be recognized into the voiceprint recognition model to obtain the identity information of the livestock with the identity to be confirmed, and the voiceprint recognition model is configured to obtain the identity information according to the voiceprint features.
According to the scheme in the embodiment of the application, the voiceprint features of the to-be-recognized voice information can be extracted based on the deep neural network, then the identity information of the to-be-recognized identity livestock is recognized through the voiceprint recognition model, due to the fact that the voiceprint features of the to-be-recognized voice information are extracted through the deep neural network, redundant information in the voice information can be removed, extraction accuracy of the voiceprint features is improved, and therefore the accuracy of livestock identity recognition is improved.
In an optional embodiment of the application, the voiceprint recognition model is a model obtained by adaptively training an original voiceprint recognition model based on the voiceprint features of livestock whose identities have been newly confirmed, and the original voiceprint recognition model is a model obtained by pre-training based on the voiceprint features of original voice information and the corresponding identity information.
In an optional embodiment of the present application, the apparatus may further include:
and the model training module is used for carrying out self-adaptive training on the original voiceprint recognition model based on the voiceprint characteristics of the livestock with the newly confirmed identity to obtain the voiceprint recognition model.
In an optional embodiment of the application, when adaptively training the original voiceprint recognition model based on the voiceprint features of the newly identified livestock to obtain the voiceprint recognition model, the model training module is specifically configured to:
adding voiceprint characteristics and identity information of the newly-identified livestock to a livestock voice information base, wherein the livestock voice information base is used for storing the corresponding relation between the voiceprint characteristics and the identity information, and the corresponding relation between the voiceprint characteristics and the identity information comprises the corresponding relation between original voice information and corresponding identity information;
and carrying out self-adaptive training on the original voiceprint recognition model according to all voiceprint characteristics in the livestock voice information base and corresponding identity information to obtain the voiceprint recognition model.
In an optional embodiment of the present application, the original voiceprint recognition model includes a voiceprint vector extractor and an identity recognition model, the voiceprint vector extractor is configured to obtain a voiceprint feature vector according to a voiceprint feature of the voice information, and the identity recognition model is configured to obtain identity information according to the voiceprint feature vector;
the model training module carries out self-adaptation training to original voiceprint recognition model according to all voiceprint features in the livestock voice information base and corresponding identity information, and when obtaining the voiceprint recognition model, the model training module is specifically used for:
determining the voiceprint feature vectors of all the voiceprint features according to the voiceprint vector extractor;
and carrying out self-adaptive training on the identity recognition model according to all the voiceprint feature vectors and the corresponding identity information to obtain an updated identity recognition model, wherein the voiceprint recognition model comprises a voiceprint vector extractor and the updated identity recognition model.
In an optional embodiment of the present application, the voiceprint recognition module 230 is specifically configured to, when inputting the voiceprint features to be recognized into the voiceprint recognition model and obtaining the identity information of the livestock to be identified:
determining a voiceprint feature vector to be identified of the voiceprint feature to be identified according to the voiceprint vector extractor;
and determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model.
In an optional embodiment of the present application, when determining, according to the updated identity recognition model, the identity information corresponding to the voiceprint feature vector of the speech information to be recognized, the voiceprint recognition module 230 is specifically configured to:
determining, according to the updated identity recognition model, the probability that the voiceprint feature vector to be recognized belongs to the voiceprint feature vector corresponding to each piece of voice information in the livestock voice information base;
and determining the identity information corresponding to the voice information with the probability value larger than the preset threshold value as the identity information corresponding to the voice information to be recognized.
The livestock voiceprint recognition device 20 of the present embodiment can execute the livestock voiceprint recognition method provided in any of the above embodiments of the present application, and the implementation principles thereof are similar, and are not repeated here.
Based on the same principle as the method shown in fig. 1, an embodiment of the present application further provides a terminal device 30, as shown in fig. 3, the terminal device 30 shown in fig. 3 includes: a processor 310 and a memory 330. Wherein the processor 310 is coupled to the memory 330, such as via a bus 320. Optionally, the terminal device 30 may further include a transceiver 340 for implementing data interaction between the terminal device 30 and other devices, and the transceiver 340 may include one or more receivers and transmitters. It should be noted that the structure of the terminal device 30 is not limited to the embodiment of the present application.
The processor 310 is applied in the embodiment of the present application to implement the functions of the voice information acquisition module 210, the voiceprint feature extraction module 220, and the voiceprint recognition module 230 shown in fig. 2.
The processor 310 may be a CPU, general purpose processor, DSP, ASIC, FPGA or other programmable logic device, transistor logic device, hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 310 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.
Bus 320 may include a path that transfers information between the above components. The bus 320 may be a PCI bus or an EISA bus, etc. The bus 320 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 3, but this does not mean only one bus or one type of bus.
Memory 330 may be, but is not limited to, a ROM or other type of static storage device that can store static information and instructions, a RAM or other type of dynamic storage device that can store information and instructions, an EEPROM, a CD-ROM or other optical disk storage, optical disk storage (including compact disk, laser disk, optical disk, digital versatile disk, blu-ray disk, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
Optionally, the memory 330 is used for storing application program codes for executing the present application, and the execution is controlled by the processor 310. The processor 310 is configured to execute application program codes stored in the memory 330 to realize the actions of the livestock voiceprint recognition device 20 provided by the embodiment shown in fig. 2.
Compared with the prior art, the terminal device 30 provided by the embodiment of the application can extract the voiceprint features of the to-be-identified voice information based on the deep neural network, then identify the identity information of the to-be-identified livestock through the voiceprint identification model, can remove the redundant information in the voice information due to the voiceprint features of the to-be-identified voice information extracted through the deep neural network, improves the extraction precision of the voiceprint features, and therefore improves the accuracy of the identification of the livestock identity.
The terminal device 30 provided in the embodiment of the present application is suitable for the device embodiments in the above embodiments, and has the same inventive concept and the same beneficial effects as the device embodiments, and is not described herein again.
Based on the same principle as the method shown in fig. 1, the present application also provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or a set of instructions, which is loaded and executed by a processor to implement the method shown in any one of the above method embodiments.
Compared with the prior art, the embodiment of the application provides a computer readable storage medium, and the scheme in the embodiment of the application can extract the voiceprint features of the to-be-identified voice information based on the deep neural network, then identifies the identity information of the to-be-identified livestock through the voiceprint identification model, can remove the redundant information in the voice information due to the voiceprint features of the to-be-identified voice information extracted through the deep neural network, improves the extraction precision of the voiceprint features, and therefore improves the accuracy of the identification of the livestock.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution order of these steps is not strictly limited, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing is only a partial embodiment of the present application. It should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present application, and such modifications and improvements should also be regarded as falling within the protection scope of the present application.
Claims (10)
1. A livestock voiceprint recognition method is characterized by comprising the following steps:
acquiring voice information to be recognized of livestock to be identified;
extracting voiceprint features to be recognized of the voice information to be recognized according to a deep neural network;
and inputting the voiceprint features to be recognized into a voiceprint recognition model to obtain identity information of the livestock to be identified, wherein the voiceprint recognition model is used for obtaining identity information according to voiceprint features.
2. The method according to claim 1, wherein the voiceprint recognition model is obtained by adaptively training an original voiceprint recognition model based on voiceprint features of livestock with newly confirmed identity, and the original voiceprint recognition model is obtained by pre-training based on voiceprint features of original voice information and corresponding identity information.
3. The method of claim 1, further comprising:
performing adaptive training on an original voiceprint recognition model based on voiceprint features of livestock with newly confirmed identity to obtain the voiceprint recognition model.
4. The method according to claim 3, wherein the performing adaptive training on the original voiceprint recognition model based on the voiceprint features of the livestock with newly confirmed identity to obtain the voiceprint recognition model comprises:
adding the voiceprint features and the identity information of the livestock with newly confirmed identity to a livestock voice information base, wherein the livestock voice information base is used for storing correspondences between voiceprint features and identity information, and the correspondences include the correspondence between the voiceprint features of the original voice information and the corresponding identity information;
and performing adaptive training on the original voiceprint recognition model according to all the voiceprint features in the livestock voice information base and the corresponding identity information to obtain the voiceprint recognition model.
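Purely as an illustrative sketch of this flow, the adaptive update might look as follows; the in-memory voice_info_base structure and the scikit-learn classifier are assumptions of the sketch, not the claimed implementation.

```python
# Hedged sketch of claim 4: append the newly confirmed animal's voiceprint feature and
# identity to the voice information base, then re-train the recognition model on
# everything stored so far. Data structure and classifier are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

voice_info_base = {"features": [], "identities": []}  # hypothetical livestock voice information base

def add_newly_confirmed(feature, identity):
    """Store the correspondence between a voiceprint feature and an identity."""
    voice_info_base["features"].append(np.asarray(feature, dtype=float))
    voice_info_base["identities"].append(identity)

def adapt_voiceprint_recognition_model():
    """Adaptively re-train the recognition model on all stored voiceprint features.
    The base must contain at least two distinct identities for the fit to succeed."""
    X = np.stack(voice_info_base["features"])
    y = np.asarray(voice_info_base["identities"])
    return LogisticRegression(max_iter=1000).fit(X, y)
```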
5. The method according to claim 4, wherein the original voiceprint recognition model comprises a voiceprint vector extractor and an identity recognition model, the voiceprint vector extractor is used for obtaining a voiceprint feature vector according to voiceprint features of the voice information, and the identity recognition model is used for obtaining identity information according to the voiceprint feature vector;
and the performing adaptive training on the original voiceprint recognition model according to all the voiceprint features in the livestock voice information base and the corresponding identity information to obtain the voiceprint recognition model comprises:
determining the voiceprint feature vectors of all the voiceprint features according to the voiceprint vector extractor;
and performing adaptive training on the identity recognition model according to all the voiceprint feature vectors and the corresponding identity information to obtain an updated identity recognition model, wherein the voiceprint recognition model comprises the voiceprint vector extractor and the updated identity recognition model.
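A possible rendering of this two-stage structure in code, again under the assumption of the PyTorch-style extractor sketched earlier: only the identity recognition model is re-trained while the voiceprint vector extractor stays fixed. The linear classifier and optimizer settings below are illustrative choices, not details taken from the claims.

```python
# Sketch of claim 5's two-stage adaptation: keep the voiceprint vector extractor fixed and
# adaptively train only the identity recognition model on the recomputed voiceprint vectors.
import torch
import torch.nn as nn

def adapt_identity_model(extractor, features, identities, num_classes, epochs=20):
    extractor.eval()
    with torch.no_grad():  # the extractor itself is not updated
        vectors = torch.stack([extractor(f) for f in features])
    labels = torch.as_tensor(identities)

    identity_model = nn.Linear(vectors.size(1), num_classes)
    optimizer = torch.optim.Adam(identity_model.parameters(), lr=1e-3)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):  # simple cross-entropy training loop over all stored vectors
        optimizer.zero_grad()
        loss = loss_fn(identity_model(vectors), labels)
        loss.backward()
        optimizer.step()
    return identity_model  # the "updated identity recognition model"
```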
6. The method of claim 5, wherein the inputting the voiceprint features to be recognized into a voiceprint recognition model to obtain the identity information of the livestock to be identified comprises:
determining a voiceprint feature vector to be recognized of the voiceprint features to be recognized according to the voiceprint vector extractor;
and determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model.
7. The method according to claim 6, wherein the determining identity information corresponding to the voiceprint feature vector to be recognized according to the updated identity recognition model comprises:
determining, according to the updated identity recognition model, a probability value that the voiceprint feature vector to be recognized corresponds to each piece of voice information in the livestock voice information base;
and determining the identity information corresponding to the voice information with the probability value larger than a preset threshold value as the identity information corresponding to the voice information to be recognized.
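For illustration, the probability-and-threshold decision of claim 7 might be implemented along the following lines; the softmax scoring and the 0.8 threshold are assumptions of the sketch, not values fixed by the claims.

```python
# Sketch of claim 7's decision rule: convert the updated identity model's scores to
# probabilities and accept an identity only if its probability exceeds a preset threshold.
import torch

def decide_identity(identity_model, voiceprint_vector, registered_ids, threshold=0.8):
    with torch.no_grad():
        probs = torch.softmax(identity_model(voiceprint_vector), dim=-1)
    best_prob, best_idx = torch.max(probs, dim=-1)
    if best_prob.item() > threshold:
        return registered_ids[int(best_idx)]
    return None  # no registered animal matched confidently
```

Returning no identity when the threshold is not exceeded leaves room for the animal to be treated as newly identity-confirmed and enrolled through the adaptive-training path described in claims 2 to 4.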
8. A livestock voiceprint recognition device, comprising:
the voice information acquisition module is used for acquiring voice information to be recognized of the livestock to be identified;
the voiceprint feature extraction module is used for extracting voiceprint features to be recognized of the voice information to be recognized according to the deep neural network;
and the voiceprint recognition module is used for inputting the voiceprint features to be recognized into the voiceprint recognition model to obtain identity information of the livestock to be identified, wherein the voiceprint recognition model is used for obtaining identity information according to voiceprint features.
9. A terminal device, comprising:
a processor, a memory, and a bus;
the bus is used for connecting the processor and the memory;
the memory is used for storing operation instructions;
the processor is configured to perform the method of any one of claims 1 to 7 by invoking the operation instructions.
10. A computer-readable storage medium having stored thereon at least one instruction, at least one program, a code set, or a set of instructions, which is loaded and executed by a processor to implement the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811348858.1A CN109493873A (en) | 2018-11-13 | 2018-11-13 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109493873A true CN109493873A (en) | 2019-03-19 |
Family
ID=65695747
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811348858.1A Pending CN109493873A (en) | 2018-11-13 | 2018-11-13 | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109493873A (en) |
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106128465A (en) * | 2016-06-23 | 2016-11-16 | 成都启英泰伦科技有限公司 | A kind of Voiceprint Recognition System and method |
CN107068154A (en) * | 2017-03-13 | 2017-08-18 | 平安科技(深圳)有限公司 | The method and system of authentication based on Application on Voiceprint Recognition |
CN107886943A (en) * | 2017-11-21 | 2018-04-06 | 广州势必可赢网络科技有限公司 | Voiceprint recognition method and device |
CN108154371A (en) * | 2018-01-12 | 2018-06-12 | 平安科技(深圳)有限公司 | Electronic device, the method for authentication and storage medium |
Non-Patent Citations (1)
Title |
---|
王跃 (WANG Yue): "Research on Text-Independent Speaker Recognition Based on I-VECTOR" (in Chinese), China Master's Theses Full-text Database, Information Science and Technology Series, pages 136-383 *
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110705512A (en) * | 2019-10-16 | 2020-01-17 | 支付宝(杭州)信息技术有限公司 | Method and device for detecting identity characteristics of stored materials |
CN110705528A (en) * | 2019-11-25 | 2020-01-17 | 支付宝(杭州)信息技术有限公司 | Identity coding method and device and feeding material identity coding method and device |
CN110929650A (en) * | 2019-11-25 | 2020-03-27 | 支付宝(杭州)信息技术有限公司 | Method and device for identifying livestock and poultry |
CN110705528B (en) * | 2019-11-25 | 2022-07-05 | 蚂蚁胜信(上海)信息技术有限公司 | Identity coding method and device and feeding material identity coding method and device |
CN111089245A (en) * | 2019-12-23 | 2020-05-01 | 宁波飞拓电器有限公司 | Multipurpose energy-saving fire-fighting emergency lamp |
CN112037815A (en) * | 2020-08-28 | 2020-12-04 | 中移(杭州)信息技术有限公司 | Audio fingerprint extraction method, server and storage medium |
CN111988426A (en) * | 2020-08-31 | 2020-11-24 | 深圳康佳电子科技有限公司 | Communication method and device based on voiceprint recognition, intelligent terminal and storage medium |
CN111988426B (en) * | 2020-08-31 | 2023-07-18 | 深圳康佳电子科技有限公司 | Communication method and device based on voiceprint recognition, intelligent terminal and storage medium |
CN112435673A (en) * | 2020-12-15 | 2021-03-02 | 北京声智科技有限公司 | Model training method and electronic terminal |
CN112435673B (en) * | 2020-12-15 | 2024-05-14 | 北京声智科技有限公司 | Model training method and electronic terminal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109493873A (en) | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium | |
JP6671020B2 (en) | Dialogue act estimation method, dialogue act estimation device and program | |
CN109271958B (en) | Face age identification method and device | |
CN104167208B (en) | A kind of method for distinguishing speek person and device | |
CN111161744B (en) | Speaker clustering method for simultaneously optimizing deep characterization learning and speaker identification estimation | |
JP7024515B2 (en) | Learning programs, learning methods and learning devices | |
CN104657574B (en) | The method for building up and device of a kind of medical diagnosismode | |
CN110634476B (en) | Method and system for rapidly building robust acoustic model | |
CN110930996B (en) | Model training method, voice recognition method, device, storage medium and equipment | |
KR102026226B1 (en) | Method for extracting signal unit features using variational inference model based deep learning and system thereof | |
CN109658921A (en) | A kind of audio signal processing method, equipment and computer readable storage medium | |
CN106802888B (en) | Word vector training method and device | |
CN111091809B (en) | Regional accent recognition method and device based on depth feature fusion | |
Bahari | Speaker age estimation using Hidden Markov Model weight supervectors | |
CN109360573A (en) | Livestock method for recognizing sound-groove, device, terminal device and computer storage medium | |
CN112613617A (en) | Uncertainty estimation method and device based on regression model | |
Kwak et al. | Quantization aware training with order strategy for CNN | |
CN109377984A (en) | A kind of audio recognition method and device based on ArcFace | |
JP6910002B2 (en) | Dialogue estimation method, dialogue activity estimation device and program | |
CN112244863A (en) | Signal identification method, signal identification device, electronic device and readable storage medium | |
CN112435672A (en) | Voiceprint recognition method, device, equipment and storage medium | |
CN112599118B (en) | Speech recognition method, device, electronic equipment and storage medium | |
CN112183631B (en) | Method and terminal for establishing intention classification model | |
CN112863518B (en) | Method and device for recognizing voice data subject | |
CN111882046B (en) | Multimedia data identification method, device, equipment and computer storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190319 |