CN109816092B

CN109816092B - Deep neural network training method and device, electronic equipment and storage medium

Info

Publication number: CN109816092B
Application number: CN201811528375.XA
Authority: CN
Inventors: 柴振华; 孟欢欢
Original assignee: Beijing Sankuai Online Technology Co Ltd
Current assignee: Beijing Sankuai Online Technology Co Ltd
Priority date: 2018-12-13
Filing date: 2018-12-13
Publication date: 2020-06-05
Anticipated expiration: 2038-12-13
Also published as: CN109816092A

Abstract

The application discloses a deep neural network training method, belongs to the technical field of computers, and is used for solving the problem that a neural network trained in the prior art is low in performance in a complex scene. The method comprises the following steps: acquiring a plurality of training samples provided with preset class labels, and training a neural network model based on the training samples; and the loss function of the neural network model is used for carrying out weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample, and determining the loss value of the neural network model. According to the deep neural network training method disclosed by the embodiment of the application, the importance of the training samples with higher distinguishing difficulty in the training samples is improved in a self-adaptive manner, the neural network error classification obtained by training the samples with higher distinguishing difficulty is avoided, and the performance of the neural network is improved.

Description

Deep neural network training method and device, electronic equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a deep neural network training method and apparatus, an electronic device, and a storage medium.

Background

In recent years, deep learning has made significant progress in the field of pattern recognition, and key factors thereof may include: the network model is rich and flexible, has strong computing capability and is more suitable for big data processing. As neural networks are applied to different tasks, improvements to neural network models are also a key issue for those skilled in the art to study. The improvement of the neural network model in the prior art mainly focuses on two aspects: network structure and loss function. Among them, the loss function commonly used in classification model training is mainly Softmax loss, and Center loss improved based on Softmax on the international top meeting ECCV in 2016. However, the applicant has found, through research on a neural network using a Center loss as a loss function in the prior art, that if a training sample for training the neural network includes a sample with high noise or a training sample with low discriminativity, and a model trained by using the existing loss function is tested, improvement of classification or recognition results of the trained neural network is limited. Therefore, deep neural network training is performed by combining the characteristics of the training sample, so that the performance of the neural network obtained by training can be improved, and the accuracy of classification and identification of the neural network obtained by applying training is improved.

Disclosure of Invention

The application provides a deep neural network training method which is beneficial to improving the performance of a neural network obtained through training, and therefore the accuracy of classification and identification of the neural network obtained through application training is improved.

In order to solve the above problem, in a first aspect, an embodiment of the present application provides a deep neural network training method, including:

acquiring a plurality of training samples provided with preset category labels;

training a neural network model based on the training samples;

the loss function of the neural network model is used for performing weighted operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample, so as to determine a loss value of the neural network model, the distinguishing difficulty of each training sample is in direct proportion to the distance between the training sample and a corresponding class center, and the corresponding class center is a class center of a class which contains the training samples and is obtained after the training samples are clustered.

In a second aspect, an embodiment of the present application provides a deep neural network training apparatus, including:

the training sample acquisition module is used for acquiring a plurality of training samples provided with preset class labels;

the model training module is used for training a neural network model based on the training samples;

and the loss function of the neural network model is used for carrying out weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample, and determining the loss value of the neural network model.

In a third aspect, an embodiment of the present application further discloses an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the deep neural network training method according to the embodiment of the present application is implemented.

In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the deep neural network training method disclosed in the present application.

According to the deep neural network training method disclosed by the embodiment of the application, a neural network model is trained on the basis of a plurality of training samples by acquiring the training samples with preset class labels; the loss function of the neural network model is used for performing weighting operation according to the first weight value which is in direct proportion to the distinguishing difficulty of each training sample, so that the loss value of the neural network model is determined, and the problem that the performance of the neural network obtained by training in the prior art is low in a complex scene is solved. According to the deep neural network training method disclosed by the embodiment of the application, the importance of the training samples with higher distinguishing difficulty in the training samples is automatically improved by improving the loss function of the neural network, the neural network misclassification obtained by training the training samples with higher distinguishing difficulty is avoided, the performance of the neural network obtained by training is favorably improved, and the accuracy of classification and identification of the neural network obtained by application is improved.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.

FIG. 1 is a flowchart of a deep neural network training method according to a first embodiment of the present application;

fig. 2 is a schematic diagram of a clustering result in the deep neural network training method according to the first embodiment of the present application;

FIG. 3 is a flowchart of object classification and recognition based on a neural network training method according to a first embodiment of the present application;

FIG. 4 is a schematic structural diagram of a deep neural network training device according to a third embodiment of the present application;

FIG. 5 is a second schematic structural diagram of a deep neural network training device according to a third embodiment of the present application;

fig. 6 is a third schematic structural diagram of a deep neural network training device according to a third embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Example one

As shown in fig. 1, the deep neural network training method disclosed in this embodiment includes: step 110 and step 120.

And step 110, obtaining a plurality of training samples provided with preset category labels.

Before training the neural network, a plurality of training samples provided with preset class labels need to be obtained.

The form of the training sample is different according to different specific application scenes. For example, in a workwear identification application, the training sample is a workwear image; in a human face living body detection application scene, training images of a living body human face and a non-living body human face (such as a human face model and a human face photo) acquired by sample image acquisition equipment; in a voice recognition application scenario, the training sample is a segment of audio.

The class labels of the training samples are different according to different specific recognition task outputs. Taking training a neural network for executing a work clothes recognition task as an example, according to a specific recognition task output, the category of the training sample may include category labels for indicating different work clothes categories such as American take-away work clothes and hundred-degree take-away work clothes. Taking training a neural network performing a voice recognition task as an example, the class of the training samples may include class labels indicating different voice classes for boys and girls, depending on the specific recognition task output. Taking the training of the neural network for performing the human face living body detection task as an example, the class of the training sample may include a class label indicating two classes, namely a living body human face and a non-living body human face.

And 120, training a neural network model based on the training samples.

In some embodiments of the present application, before training the neural network model, the method further comprises: clustering the training samples with the same class label, and determining the class center of the class training sample. The difficulty of distinguishing each of the training samples is proportional to the distance of the training sample from the center of the corresponding category. The specific method for clustering the training samples is referred to in the prior art, and is not described in the embodiments of the present application again.

After a training sample provided with a preset class label is obtained, a deep neural network is further constructed, and then a constructed neural network model is trained based on the training sample.

In particular, the subject network may select a ResNet (residual error network), such as a ResNet50 residual error network, and then improve the performance of the trained neural network by improving the loss function and the training process of the neural network. For example, the loss function of the neural network is set as a combined loss function composed of a softmax loss function and a central loss function based on an attention mechanism, and by improving the loss function of the neural network, when an output loss value of the loss function is calculated, a sample with a large distance from a clustering center is adaptively adjusted to influence the loss value of the loss function output, namely, the sample with a large distinguishing difficulty is adaptively adjusted to influence the loss value of the loss function output, so that the samples with the large distinguishing difficulty are prevented from being ignored in a model training process, and the accuracy of model classification or identification obtained by training is reduced.

In specific implementation, the loss function of the neural network model is taken as:

for example, among others, the compounds of formula,

the function of the loss is represented by,

expressed as softmax loss function:

for the central loss function based on the attention mechanism, it is expressed as:

in the above formula, i represents the identity of the training samples, m represents the total number of training samples, y_iRepresenting the class identity, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iIn the above-mentioned training sample, the training sample,

j-th column of the weight matrix representing the last fully connected layer before the loss function, b_jColumn j representing the last fully connected layer offset b before the loss function,

y-th of weight matrix representing last fully-connected layer before the loss function_iThe columns of the image data are,

y-th representing the last fully-connected layer offset b before the loss function_iThe columns of the image data are,

representing a training sample x_iFirst weight of (2), and cluster center

With m representing the number of training samples, T representing the transposition, and λ representing the scalar. γ represents a scalar quantity, and is a number greater than 0, e.g., γ is equal to 2.

In specific implementation, after preliminary classification, samples closer to the classification hyperplane are easily misclassified to other classes under the influence of noise, so that system judgment is wrong. As shown in fig. 2, the preliminary classification result diagram is shown, wherein the

samples

211, 212, 221, and 222 are closer to the classification hyperplane, i.e. farther from the respective class centers, under the influence of noise, the

samples

211 and 212 are easily misclassified into the class 22, and similarly, the

samples

221 and 222 are easily misclassified into the class 21. If the loss function of the neural network uses only the Softmax function, i.e. in the above formula

The discriminant is ignored (not only the separation but also a certain interval is needed) because only the distinctiveness between the samples is considered, so that the classification efficiency is low and the performance of the neural network model is reduced. In practical situations, for some updated samples, some samples closer to the classification hyperplane are easily misclassified to other classes under the influence of noise, so that the system is misjudged.

Therefore, the influence of the distance between the training sample and the class on the loss value is comprehensively considered in the prior art by adding a central loss function based on an attention mechanism. However, in the prior art, each training sample is treated consistently, and the influence of the training sample on the loss value output by the loss function is the same regardless of the distance between the training sample and the center of the class. Through trial and error, the inventors found that the class center distance between the

training samples

211 and 212 and the class 21 as shown in fig. 2 should not have the same effect on the loss value of the loss function. Taking a neural network for training a classification task of the frock clothes as an example, the American group frock clothes and the I-Chi arrival frock clothes belong to the category of the American group take-out frock clothes, the American group frock clothes are usually the frock clothes closer to a clustering center, and the I-Chi arrival frock clothes are classified into the frock clothes farther from the clustering center due to the small number of the I-Chi arrival frock clothes. If the loss function is calculated

When the loss value is not equal, the influence of the loss value output by the order-me arrival frock clothes and the beauty team frock clothes is the same, namely, the order-me arrival frock clothes and the beauty team frock clothes have the same sample weight, so that the trained neural network can wrongly classify the order-me arrival frock clothes and the beauty team frock clothes into different classes when identifying the frock clothes. Therefore, in the embodiment, the corresponding weight is set for the training sample according to the distance between the training sample and the clustering center in a self-adaptive manner, so that the training sample pair loss function with high distinguishing difficulty is improved

The influence of the loss value of (c). In practice, training sample x_iWeight of (2), and clustering center

Is proportional to the distance between them.

In the specific training process, the computer calculates a loss function value corresponding to each input training sample by executing the loss function, and compares the calculated loss function value with the difference between sample labels; then, by continuously adjusting the parameters of the loss function, the loss function values of different parameters corresponding to each input training sample are repeatedly calculated, and finally, the parameter which satisfies the minimum difference between the loss function values and the sample labels is determined and used as the parameter of the loss function. And calculating an output value corresponding to the sample input to the neural network based on the determined parameter. The output value of the neural network corresponds to the classification result of the input sample.

During specific training, the neural network can be optimized through a reverse conduction and gradient descent method, so that the loss value output by the loss function is minimum, and the optimized neural network is obtained.

In some embodiments of the present application, the method for adjusting the clustering center by the gradient descent method may be further optimized, and the influence of the training samples when adjusting the clustering center is set by considering the differentiation difficulty of the training samples, so as to further improve the performance of the network model obtained by training.

The specific training process of the neural network can be referred to in the prior art, and is not described in detail in this embodiment.

When the method is specifically implemented, other network structures can be used as the basic network structure of the neural network, the optimization training of the loss function exists in the training process of any network structure, the specific network structure of the neural network is not limited, and only the realization and optimization method of the loss function is limited.

The deep neural network training method disclosed by the embodiment of the application comprises the steps of obtaining a plurality of training samples provided with preset class labels, clustering the training samples with the same labels, and determining class centers; then, training a neural network model based on the training samples; the loss function of the neural network model is used for performing weighting operation according to the first weight value which is in direct proportion to the distinguishing difficulty of each training sample, so that the loss value of the neural network model is determined, and the problem that the performance of the neural network obtained by training in the prior art is low in a complex scene is solved. According to the deep neural network training method disclosed by the embodiment of the application, the importance of the training samples with higher distinguishing difficulty in the training samples is improved in a self-adaptive mode by improving the loss function of the neural network, the neural network error classification obtained by training the training samples with higher distinguishing difficulty is avoided, and the performance of the neural network obtained by training is improved.

Example two

Based on the first embodiment, the present embodiment discloses an optimization scheme of a deep neural network training method.

In specific implementation, after a plurality of training samples provided with preset class labels are obtained, a neural network is constructed at first. In this embodiment, a ResNet50 (residual error network) is still used as a base network to construct a neural network, which includes a plurality of feature extraction layers. The neural network calls forward functions of each feature extraction layer (such as a full connection layer) in sequence through a forward propagation stage to obtain output layer by layer, the last layer is compared with a target function to obtain a loss function, and an error updating value is calculated. Then, layer by layer, the first layer is reached by back propagation, and all weights are updated together at the end of back propagation. And the last feature extraction layer takes the extracted features as a loss function of a predicted value input value of the neural network, and the loss function obtains a difference value between the predicted value and the real label through a series of calculations and determines the difference value as a loss value of the neural network. The purpose of training the neural network is to minimize the difference between the predicted values and the true labels.

In other preferred embodiments of the present application, the first weight value

It can be represented by a normal distribution function of the distance between the training sample and the class center, for example:

where σ is a constant, x_iRepresentation of belonging to category y_iIn the above-mentioned training sample, the training sample,

represents a category y_iOf the center of (c). As can be seen from the above formula, the greater the difficulty of distinguishing the training samples, i.e., the training sample x_iCenter of class to which it belongs

The greater the distance of (A), the first weight value

The larger the value. Namely, the greater the difficulty in distinguishing the training samples, the greater the importance of the training samples needs to be raised when training the neural network.

In other embodiments of the present application, calculating the first weight may also be expressed by other direct proportional relation formulas of the training samples and the distance between the training samples and the class centers, which is not illustrated in this embodiment.

Specifically, in the embodiment, in the process of back propagation, the loss representing the error between the actual values of the prediction value ranges of the training samples output by the loss function is caused by continuously adjusting the clustering centerThe value is minimal. When implemented, it is usually represented by the formula

To update the cluster center

The updating amount of the category center is updated

When the training samples to be trained close to the class center are considered in an important mode, the training samples far away from the class center are weakened. Therefore, when the class center is updated, the neural network model is used for performing weighting operation according to a second weight value which is inversely proportional to the distinguishing difficulty of each training sample, and determining the variation of the corresponding class center. Taking the classification result shown in fig. 2 as an example, the update amount of the class center of the class 22 is updated in the calculation value

In time, the difficulty of distinguishing the important reference rate similar samples 222 is small, that is, the training samples closer to the center of the category contribute to the update amount. While similar samples 221 are more difficult to distinguish, i.e., training samples farther from the center of the class will contribute less to the update volume. In specific implementation, the updating amount of the class center can be calculated by setting corresponding weights for different training samples according to the distance between the training sample and the class center to which the training sample belongs

For example, the determining the variation of the corresponding category center according to the weighted operation according to the second weight inversely proportional to the difficulty of distinguishing of each training sample includes: according to the formula

Determining a class center c_jWherein i represents the identity of the training sample, m represents the trainingTotal number of training samples, j and y_iRepresenting the class identity, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iA training sample of q_iRepresents a second weight value, q_iAnd training sample x_iAnd a class center c_jThe distance between the two is in inverse proportion, delta () is a dirac function, when the condition in brackets is satisfied, delta is equal to 1, and vice versa is equal to 0, α_cA scalar quantity for controlling the learning rate of the class center, and the value range is [0, 1 ]]。

With reference to the first embodiment, the loss function is expressed as:

wherein the central loss function based on the attention mechanism is expressed as:

for example, a specific scheme for adjusting the cluster center is described.

Determining the variation of the category center by calculating the partial derivative of the center loss value to the sample, wherein the derivation process of the specific formula is as follows:

further, in the above-mentioned case,

this yields:

wherein,

σ_cis a constant number, x_iRepresentation of belonging to category y_iIn the above-mentioned training sample, the training sample,

represents a category y_iOf the center of (c). That is, in some embodiments of the present application, the second weight q_iCan be represented by a normal distribution function of the distance between the training sample and the center of the class.

In other embodiments of the present application, the second weight q_iThe distance between the training sample and the center of the category can be expressed by other inverse proportional relation formulas, which is not illustrated in this embodiment. Preferably, the calculation method of the second weight is matched with the calculation method of the first weight.

When the class center is updated, the influence of the training samples which are difficult to distinguish on the class center is weakened, and meanwhile, the influence of the training samples which are easy to distinguish on the class center is improved, so that the classification accuracy of the neural network obtained by training can be further improved, and the performance of the neural network model is improved.

In further preferred embodiments of the present application, the loss function of the neural network model is used to: when the neural network model calculates the loss value of the training sample, the loss value of the training sample is adjusted through a third weight value inversely proportional to the classification proportion of the training sample.

Specifically, the loss function of the neural network model is represented as:

wherein,

representing a loss function, i representing the identity of the training samples, m representing the total number of training samples, y_iRepresenting the class identity, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iIn the above-mentioned training sample, the training sample,

before representing said loss functionJ column of the weight matrix of the last fully connected layer, b_jColumn j representing the last fully connected layer offset b before the loss function,

y-th representing the last fully-connected layer offset b before the loss function_iColumn, (1-k)_i)^γRepresenting a training sample x_iFirst weight of k_iA value of a number greater than 0 and less than 1, and k_iAnd training sample x_iAnd training sample x_iThe distance of the centers of the categories is inversely proportional,

representing a training sample x_iThe third weight value of (a) is,

value and category y of_iThe ratio of the training samples of (a) to (b) in the full training samples is inversely proportional, T denotes transposition, and λ and γ denote scalars. In specific practice, k_iIt may be formulated by other inverse proportional relationships of the distance of the training sample from the center of the class.

EXAMPLE III

The embodiment of the application also discloses a deep neural network training method which is applied to classification application. As shown in fig. 3, the method includes: step 310 to step 370.

In step 310, a plurality of training samples with preset category labels are obtained.

In a specific implementation of the present application, the training samples include any one of: image, text, speech. Aiming at different objects to be classified, training samples of the corresponding objects to be classified need to be obtained during neural network modeling. In this embodiment, taking training of a neural network model for worker garment identification as an example, first, a worker garment image provided with different platform labels is obtained, for example: the service uniform image is provided with a beautiful group takeout platform label, the service uniform image is provided with a hungry platform label, the service uniform image is provided with a hundred-degree takeout platform label, and the like.

And 320, clustering the training samples with the same class labels, and determining the class center of the class training sample.

The difficulty of distinguishing each of the training samples is proportional to the distance of the training sample from the center of the corresponding category.

For the second embodiment, reference is made to cluster the training samples with the same class label, and determine the class center of the training sample of the class, which is not described in detail in this embodiment.

Step 330, training the neural network model based on the training samples.

Thereafter, a neural network model is trained based on the acquired uniform images.

When training a neural network model based on the acquired work clothes image, firstly clustering training samples with the same class label in the acquired training samples, and determining a class center corresponding to the training sample with each class label. For the training sample in each category obtained by clustering, the distance between the sample and the center of the category represents the distinguishing difficulty of the sample: the greater the distance between the sample and the center of the category, the greater the difficulty in distinguishing the sample; conversely, the smaller the distance between the sample and the center of the category, the less difficult the sample is to distinguish.

The training process of the neural network is that the computer calculates the loss function value corresponding to each training sample input by executing the loss function, and compares the calculated loss function value with the difference between the sample labels; and then, repeatedly calculating the loss function value of different parameters corresponding to each input training sample by continuously adjusting the parameters of the loss function, and finally determining the parameter which meets the minimum difference between the loss function value and the sample label and is used as the parameter of the loss function.

For a specific implementation of training the neural network model based on the obtained work clothes image, reference is made to the foregoing first embodiment and second embodiment, which are not described again in this embodiment.

And 340, acquiring object data of the object to be classified matched with the training sample through data acquisition equipment.

And for different objects to be classified, acquiring object data of the objects to be classified through corresponding data acquisition equipment. For example, when the object to be classified is a work clothes, images of the takeaway personnel can be acquired through the camera to obtain work clothes image data.

Step 350, obtaining the classification characteristics of the object data.

According to the same method as that in the model training, the feature vector of the work clothes image data is obtained and used as the classification feature.

And step 360, inputting the classification features into the trained neural network model to obtain an output result of the neural network model.

Inputting the obtained classification features into the neural network model trained in the signing step to obtain an output result of the neural network model.

And step 370, executing preset operation according to the output result.

In some embodiments of the present application, the performing a preset operation according to the output result includes any one or more of: displaying the classification result of the object to be classified corresponding to the output result; outputting an entrance guard control signal according to the output result; and executing the dispatching operation according to the output result.

Specifically, for the work clothes identification application scene, when the object to be identified is a work clothes, the step classifies the output acquired work clothes image into the confidence of the corresponding class label. In specific implementation, for the American group take-out platform, when a certain take-out person is identified to wear the American group work clothes, the order sending operation can be executed on the take-out person; when recognizing that a certain takeaway person wears hundred-degree industrial clothes, the order dispatching operation is not executed on the takeaway person.

According to the neural network training method disclosed by the application, in the process of training the neural network for the work clothes recognition task, as the training samples of the Mei-Ju work clothes are very many, the training samples of the Diao-Ju work clothes are fewer, and the Mei-Ju work clothes and the Diao-Ju work clothes have the same class labels, the influence of the training samples of the Dia-Ju work clothes on the sample training process can be properly promoted, namely, the influence of the Dia-Ju work clothes on the loss value of the output of the loss function is promoted, so that the phenomenon that the class samples are ignored in the classification process due to the fact that a certain class of samples are few is avoided, and the phenomenon that the trained neural network.

The inventor trains the neural network model which is constructed based on different loss functions and executes the work clothes recognition task based on the same training data set, and tests the trained neural network through the same testing data set. For example, when a mei-qu-sent actual spot check chart of a rider is used, namely, mei-qu-i-i. When attack testing is carried out by taking a non-masque special delivery rider spot check chart (from a part of crowdsourcing riders, a registration chart and a rider spot check chart wearing other work uniforms), namely taking image characteristics of the non-masque work uniforms and image characteristics of non-point I-me work uniforms as test samples, the probability of classifying the neural network constructed based on the softmax loss function into the masque work uniforms is 0.41%, the probability of classifying the neural network constructed based on the softmax loss function and the central loss function based on the attention mechanism in the prior art is also 0.41%, the probability of classifying the neural network constructed based on the softmax loss function and the central loss function based on the attention mechanism combined with the weighting operation disclosed by the application into the masque work uniforms is 0.40%, and the classification error rate is obviously reduced.

The neural network training method disclosed by the embodiment of the application comprises the steps of obtaining a plurality of training samples with preset class labels, clustering the training samples with the same class labels, determining class centers of the class training samples, training a neural network model based on the training samples, obtaining object data of an object to be classified matched with the training samples through data acquisition equipment in an online application process, and further obtaining classification characteristics of the object data; and finally, executing preset operation according to the output result, wherein a loss function of the neural network model is used for performing weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample to determine a loss value of the neural network model, and the distinguishing difficulty of each training sample is in direct proportion to the distance between the training sample and the corresponding class center, so that the accuracy of determining object classification based on the model obtained by training can be improved, and the preset operation can be accurately executed.

Example four

As shown in fig. 4, the deep neural network training apparatus disclosed in this embodiment includes:

a training sample obtaining module 410, configured to obtain a plurality of training samples with preset category labels;

a model training module 420 for training a neural network model based on the training samples;

Optionally, as shown in fig. 5, the apparatus further includes:

the clustering module 430 is configured to cluster the training samples with the same class label, and determine a class center of the class training sample;

Optionally, when the category center is updated, the neural network model is configured to perform a weighted operation according to a second weight inversely proportional to the difficulty of distinguishing between the training samples, and determine a variation of the corresponding category center.

Optionally, the loss function of the neural network model is used to: when the neural network model calculates the loss value of the training sample, the loss value of the training sample is adjusted through a third weight value inversely proportional to the classification proportion of the training sample.

Further optionally, the loss function of the neural network model is represented as:

wherein, among others,

representing a training sample x_iThe third weight value of (a) is,

value and category y of_iThe ratio of the training samples of (a) to (b) in the full training samples is inversely proportional, T denotes transposition, and λ and γ denote scalars.

In other preferred embodiments of the present application,

The greater the distance of (a), k_iThe smaller, the first weight

The larger the value. I.e. the area of the training sampleThe greater the degree of difficulty, the greater the importance of the neural network needs to be raised correspondingly when training.

In the process of back propagation, the loss value which is output by the loss function and represents the error between the actual values of the prediction value range of the training sample is minimized by continuously adjusting the clustering center. When implemented, it is usually represented by the formula

To update the cluster center

The updating amount of the category center is updated

When the training samples to be trained close to the class center are considered in an important mode, the training samples far away from the class center are weakened. Further optionally, the performing a weighted operation according to a second weight inversely proportional to the difficulty of distinguishing between the training samples, and determining the variation of the corresponding class center includes:

according to the formula

Determining a class center c_jWherein i represents the identity of the training samples, m represents the total number of training samples, j and y_iRepresenting the class identity, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iA training sample of q_iRepresents a second weight value, q_iAnd training sample x_iAnd a class center c_jThe distance between the two is in inverse proportion, delta () is a dirac function, when the condition in brackets is satisfied, delta is equal to 1, and vice versa is equal to 0, α_cA scalar quantity for controlling the learning rate of the class center, with a value range of [0, 1%]。

In some embodiments of the present application,

σ_cis a constant number, x_iIndicates belonging to a categoryy_iIn the above-mentioned training sample, the training sample,

represents a category y_iOf the center of (c).

Optionally, the training sample includes any one of: image, text, speech.

Optionally, as shown in fig. 6, the apparatus further includes:

the data acquisition module 440 is configured to acquire, through a data acquisition device, object data of an object to be classified that is matched with the training sample;

a feature obtaining module 450, configured to obtain a classification feature of the object data;

the model calling module 460 is configured to input the classification features to the trained neural network model, so as to obtain an output result of the neural network model;

and the execution module 470 is configured to execute a preset operation according to the output result.

Optionally, the executing of the preset operation according to the output result includes any one or more of the following:

displaying the classification result of the object to be classified corresponding to the output result;

outputting an entrance guard control signal according to the output result;

and executing the dispatching operation according to the output result.

The deep neural network training device disclosed in the embodiment of the present application is used to implement the steps of the deep neural network training method described in the first and second embodiments of the present application, and the specific implementation of each module of the device refers to the corresponding step, which is not described herein again.

The deep neural network training device disclosed by the embodiment of the application determines the class center of the class training sample by acquiring a plurality of training samples provided with preset class labels and clustering the training samples with the same class labels, and trains a neural network model based on the training samples; the loss function of the neural network model is used for performing weighting operation according to the first weight value which is in direct proportion to the distinguishing difficulty of each training sample, so that the loss value of the neural network model is determined, and the problem that the performance of the neural network obtained by training in the prior art is low in a complex scene is solved. The deep neural network training device disclosed by the embodiment of the application improves the loss function of the neural network, adaptively improves the importance of the training sample with larger difficulty in the training sample, avoids the error classification of the network model obtained by training the training sample with larger difficulty, and is beneficial to improving the performance of the neural network obtained by training.

In the neural network training device disclosed by the embodiment of the application, in the online application process, object data of an object to be classified matched with the training sample is obtained through data acquisition equipment, and the classification characteristic of the object data is further obtained; and finally, executing preset operation according to the output result, wherein a loss function of the neural network model is used for performing weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample to determine a loss value of the neural network model, and the distinguishing difficulty of each training sample is in direct proportion to the distance between the training sample and the corresponding class center, so that the accuracy of determining object classification based on the model obtained by training can be improved, and the preset operation can be accurately executed.

Correspondingly, the application also discloses an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to implement the deep neural network training method according to the first embodiment to the third embodiment of the application. The electronic device can be a PC, a mobile terminal, a personal digital assistant, a tablet computer and the like.

The present application also discloses a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the deep neural network training method as described in embodiments one to three of the present application.

The embodiments in the present specification are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the device embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, refer to the partial description of the method embodiment.

The deep neural network training method and device provided by the application are introduced in detail, specific examples are applied in the method to explain the principle and the implementation mode of the application, and the description of the embodiments is only used for helping to understand the method and the core idea of the application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.

Claims

1. A deep neural network training method, comprising:

acquiring a plurality of training samples provided with preset category labels;

clustering the training samples with the same class label, and determining the class center of the class training sample; the distinguishing difficulty of each training sample is in direct proportion to the distance of the training sample from the center of the corresponding category;

training a neural network model based on the training samples;

the loss function of the neural network model is used for carrying out weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample, and determining a loss value of the neural network model; the training sample comprises any one of the following items: work clothes images, live face and non-live face images, texts and voice audio; the category label includes any one of: class labels for different classes of work clothes, class labels indicating different sound classes for boys and girls, class labels indicating two classes, live faces and non-live faces.

2. The method of claim 1, wherein in updating the class centers, the neural network model is configured to perform a weighting operation according to a second weight inversely proportional to the difficulty of distinguishing between the training samples, and determine the variation of the corresponding class centers.

3. The method of claim 2, wherein the loss function of the neural network model is used to: when the neural network model calculates the loss value of the training sample, the loss value of the training sample is adjusted through a third weight value inversely proportional to the classification proportion of the training sample.

4. The method of claim 3, wherein the loss function of the neural network model is represented as:

wherein,

representing a loss function, i representing the identity of the training samples, m representing the total number of training samples, y_iRepresents the aboveClass identification, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iIn the above-mentioned training sample, the training sample,

y-th representing the last fully-connected layer offset b before the loss function_iColumn, (1-k)_i)^γRepresenting a training sample x_iFirst weight of k_iA value of a number greater than 0 and less than 1, and k_iAnd training sample x_ix_iThe distance of the centers of the categories is inversely proportional,

representing a training sample x_iThe third weight value of (a) is,

5. The method of claim 2, wherein the determining the variation of the corresponding class center according to the weighted operation according to the second weight inversely proportional to the distinguishing difficulty of each training sample comprises:

according to the formula

6. The method of claim 1, further comprising:

acquiring object data of an object to be classified matched with the training sample through data acquisition equipment;

obtaining classification features of the object data;

inputting the classification features into the trained neural network model to obtain an output result of the neural network model;

and executing preset operation according to the output result.

7. The method of claim 6, wherein the performing the preset operation according to the output result comprises any one or more of:

outputting an entrance guard control signal according to the output result;

and executing the dispatching operation according to the output result.

8. A deep neural network training apparatus, comprising:

the clustering module is used for clustering the training samples with the same class labels and determining the class center of the class training samples; the distinguishing difficulty of each training sample is in direct proportion to the distance of the training sample from the center of the corresponding category;

the loss function of the neural network model is used for carrying out weighting operation according to a first weight value which is in direct proportion to the distinguishing difficulty of each training sample, and determining a loss value of the neural network model;

the training sample comprises any one of the following items: work clothes images, live face and non-live face images, texts and voice audio; the category label includes any one of: class labels for different classes of work clothes, class labels indicating different sound classes for boys and girls, class labels indicating two classes, live faces and non-live faces.

9. The apparatus of claim 8, wherein in updating the class centers, the neural network model is configured to perform a weighting operation according to a second weight inversely proportional to the difficulty of distinguishing between the training samples, so as to determine the variation of the corresponding class centers.

10. The apparatus of claim 9, wherein the loss function of the neural network model is to: when the neural network model calculates the loss value of the training sample, the loss value of the training sample is adjusted through a third weight value inversely proportional to the classification proportion of the training sample.

11. The apparatus of claim 10, wherein the loss function of the neural network model is represented as:

wherein,

y-th representing the last fully-connected layer offset b before the loss function_iColumn, (1-k)_i)^γRepresenting a training sample x_iFirst weight of k_iA value of a number greater than 0 and less than 1, and k_iAnd training sample x_i，x_iThe distance of the centers of the categories is inversely proportional,

representing a training sample x_iThe third weight value of (a) is,

12. The apparatus of claim 8, wherein the weighting according to the second weight inversely proportional to the difficulty of distinguishing between the training samples, and the determining the variation of the corresponding class center comprises:

according to the formula

Determining a class center c_jWherein i represents the identity of the training samples, m represents the total number of training samples, j and y_iRepresenting the class identity, x, of the neural network model input to the loss function_iRepresentation of belonging to category y_iA training sample of q_iRepresents a second weight value, q_iAnd training sample x_iAnd a class center c_jThe distance between delta () is inversely proportional to the dirac function, delta is 1 when the condition in parentheses is satisfied, and is otherwise 0, a_cA scalar quantity for controlling the learning rate of the class center, with a value range of [0, 1%]。

13. The apparatus of claim 8, further comprising:

the data acquisition module is used for acquiring object data of the object to be classified matched with the training sample through data acquisition equipment;

the characteristic acquisition module is used for acquiring the classification characteristic of the object data;

the model calling module is used for inputting the classification features to the trained neural network model to obtain an output result of the neural network model;

and the execution module is used for executing preset operation according to the output result.

14. The apparatus of claim 13, wherein the performing of the preset operation according to the output result comprises any one or more of:

outputting an entrance guard control signal according to the output result;

and executing the dispatching operation according to the output result.

15. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the deep neural network training method of any one of claims 1 to 7 when executing the computer program.

16. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the deep neural network training method of any one of claims 1 to 7.