CN108510083A - Neural network model compression method and device - Google Patents
Neural network model compression method and device
- Publication number: CN108510083A
- Application number: CN201810274146.3A
- Authority: CN (China)
- Prior art keywords: network model, neural network, feature vector, similarity, compressed
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The present invention provides a neural network model compression method and device. The method includes: inputting training data into both a neural network model to be compressed and a target neural network model; and training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model; wherein the target neural network model has fewer parameters than the neural network model to be compressed. In embodiments of the present invention, the feature vectors and classification results that the model to be compressed extracts from the training data guide the training of the target neural network model, so that the resulting compressed model and the model to be compressed classify the same training data identically. Compression therefore causes no loss of accuracy: the size of the model is reduced while accuracy is ensured, meeting the dual requirements of accuracy and model size.
Description
Technical field
The present invention relates to the field of machine learning technology, and in particular to a neural network model compression method and device.
Background
The rapid development of neural networks in fields such as image, speech, and text processing has driven a series of intelligent products to market. To let neural networks learn the features of training data better and improve model performance, the number of parameters used to represent a neural network model has grown rapidly and the number of layers keeps increasing, so deep neural network models suffer from numerous parameters and heavy computation during both training and inference. As a result, products based on neural networks mostly depend on server-side computing power and are highly dependent on a good running environment and network environment, which restricts the application range of neural network models; embedded applications, for example, cannot be realized. To enable embedded applications of neural network models, the size of a neural network model needs to be compressed below a certain range.
Current model compression methods generally include the following. First, pruning: after a large model has been fully trained, the parameters with very small weights are removed from the network model, and the model then continues to be trained. Second, weight sharing, which reduces the number of parameters by letting multiple connections share the same weights. Third, quantization: in general, the parameters of a neural network model are all represented as 32-bit floating-point numbers, yet such high precision rarely needs to be retained; quantization, for example representing the precision of the original 32 bits with the integers 0-255, reduces the space each weight occupies at the cost of some precision. Fourth, network binarization, which represents the parameters of the network model as binary numbers to reduce the model size.
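To make the quantization idea concrete, the following is a minimal sketch, not taken from the patent: a hypothetical linear 8-bit scheme that maps 32-bit floating-point weights onto the integers 0-255 and keeps the scale and offset needed to restore approximate values.

```python
# Hypothetical sketch of 8-bit linear quantization: map 32-bit float weights
# onto integers 0..255, keeping scale/offset so weights can be restored.
import numpy as np

def quantize_uint8(w: np.ndarray):
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0 or 1.0   # guard against a constant tensor
    q = np.round((w - w_min) / scale).astype(np.uint8)
    return q, scale, w_min

def dequantize(q: np.ndarray, scale: float, w_min: float) -> np.ndarray:
    return q.astype(np.float32) * scale + w_min

w = np.random.randn(4, 4).astype(np.float32)
q, scale, offset = quantize_uint8(w)
w_restored = dequantize(q, scale, offset)  # lossy: 4x smaller, small precision error
```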
However, all of the above methods perform compression directly on the model to be compressed, and they do so on the premise of sacrificing model accuracy, so they often fail to meet the accuracy requirements of actual use.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a neural network model compression method and device that can compress the size of a model while ensuring the accuracy of the neural network model.
In a first aspect, an embodiment of the present invention provides a neural network model compression method, which includes:
inputting training data into a neural network model to be compressed and a target neural network model;
training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the target neural network model has fewer parameters than the neural network model to be compressed.
In a second aspect, an embodiment of the present invention further provides a neural network model compression device, which includes:
an input module, configured to input training data into a neural network model to be compressed and a target neural network model;
a training module, configured to train the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the target neural network model has fewer parameters than the neural network model to be compressed.
In the neural network model compression method and device provided by the embodiments of the present application, before a neural network model is compressed, a target neural network whose number of parameters is smaller than that of the model to be compressed is constructed in advance. The training data is then input into both the neural network model to be compressed and the target neural network model, and the feature vectors and classification results that the model to be compressed extracts from the training data guide the training of the target neural network model, yielding a compressed neural network model. The resulting compressed model and the model to be compressed classify the same training data identically, so the compression causes no loss of accuracy: the size of the model is reduced while accuracy is ensured, meeting the dual requirements of accuracy and model size.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. It should be understood that the following drawings show only certain embodiments of the present invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a flow chart of the neural network model compression method provided by Embodiment 1 of the present application;
Fig. 2 shows a flow chart of a method, provided by Embodiment 2 of the present application, for training the target neural network model based on the classification results of the neural network model to be compressed on the training data;
Fig. 3 shows a schematic diagram of the model compression process provided by Embodiment 2 of the present application;
Fig. 4 shows a flow chart of the first comparing operation provided by Embodiment 3 of the present application;
Fig. 5 shows a flow chart of a method, provided by Embodiment 4 of the present application, for performing similarity matching between the first feature vector and the second feature vector and training the target neural network for the current round according to the matching result;
Fig. 6 shows a flow chart of the similarity determination operation provided by Embodiment 4 of the present application;
Fig. 7 shows a flow chart of another method, provided by Embodiment 5 of the present application, for performing similarity matching between the first feature vector and the second feature vector and training the target neural network for the current round according to the matching result;
Fig. 8 shows a flow chart of the similarity determination operation provided by Embodiment 5 of the present application;
Fig. 9 shows a flow chart of the neural network model compression method provided by Embodiment 6 of the present application;
Fig. 10 shows a schematic structural diagram of the neural network model compression device provided by Embodiment 7 of the present application;
Fig. 11 shows a schematic structural diagram of a computer device provided by Embodiment 8 of the present application.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention, as generally described and illustrated in the drawings herein, can be arranged and designed in a variety of different configurations. The following detailed description of the embodiments is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
To facilitate understanding of this embodiment, the neural network model compression method disclosed in the embodiments of the present invention is first described in detail; the method can be used to compress the size of various neural network models.
Referring to Fig. 1, the neural network model compression method provided by Embodiment 1 of the present application includes:
S101: Input training data into a neural network model to be compressed and a target neural network model.
In a specific implementation, the neural network model to be compressed is a neural network model of relatively large size that has already been trained on the training data; it may consist of a single neural network or an ensemble of multiple neural networks. Compared with the target neural network model, it has more parameters. The parameters here may include the number of feature-extraction layers of the neural network and/or the parameters within each feature-extraction layer.
Therefore, to compress a neural network model, the training data needs to be input into the model to be compressed so that it learns the features of the training data; this realizes the training of the model to be compressed, and the trained model serves as the neural network model that needs to be compressed.
The target neural network model is a pre-constructed neural network model that has fewer parameters than the model to be compressed, for example fewer feature-extraction layers, a simpler neural network structure, or feature-extraction layers with fewer parameters.
It should be noted here that if the neural network model to be compressed was trained with an unsupervised method, the training data carries no labels; if it was trained with a supervised method, the training data carries labels; and if it was trained with transfer learning, the training data may be either unlabeled or labeled.
S102: Train the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, obtaining a compressed neural network model.
In a specific implementation, the training data is input into the neural network model to be compressed and the target neural network model; the classification results of the model to be compressed on the training data are used to guide the training of the target neural network model, so that during training the target model's classification results on the training data are kept as close as possible to those of the model to be compressed.
In the neural network model compression method provided by the embodiments of the present application, when a neural network model is to be compressed, a target neural network whose number of parameters is smaller than that of the model to be compressed is constructed in advance. The training data is then input into both models, and the feature vectors and classification results that the model to be compressed extracts from the training data guide the training of the target neural network model, yielding a compressed neural network model. The compression is not carried out directly on the model to be compressed, and the resulting compressed model and the model to be compressed classify the same training data identically, so the compression causes no loss of accuracy: the size of the model is reduced while accuracy is ensured, meeting the dual requirements of accuracy and model size.
Specifically, the neural network model to be compressed generally includes a neural network to be compressed and a classifier to be compressed; the target neural network model generally includes a target neural network and a target classifier; and the compressed neural network model obtained by training includes a compressed neural network and a compressed classifier.
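The relationship between these components can be sketched as follows. This is an illustrative reading only: the input dimension (784), layer sizes, feature dimension, and class count are assumptions, not values from the patent; the essential point is that both models share the feature-extractor-plus-classifier structure while the target model has far fewer parameters.

```python
# Illustrative structure of the two models; all sizes are assumptions.
import torch.nn as nn

class FeatureNet(nn.Module):
    """A neural network (feature extractor) followed by a classifier."""
    def __init__(self, hidden_sizes, feat_dim=256, num_classes=10, in_dim=784):
        super().__init__()
        layers = []
        for h in hidden_sizes:
            layers += [nn.Linear(in_dim, h), nn.ReLU()]
            in_dim = h
        layers.append(nn.Linear(in_dim, feat_dim))
        self.features = nn.Sequential(*layers)              # extracts feature vectors
        self.classifier = nn.Linear(feat_dim, num_classes)  # classifies the features

    def forward(self, x):
        f = self.features(x)
        return f, self.classifier(f)

teacher = FeatureNet([1024, 1024, 512])  # neural network model to be compressed
student = FeatureNet([128])              # target model: far fewer parameters
```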
Referring to Fig. 2, Embodiment 2 of the present application further provides a method for training the target neural network model based on the classification results of the neural network model to be compressed on the training data, including:
S201: Extract a first feature vector from the training data input to the neural network to be compressed, and extract a second feature vector from the training data input to the target neural network.
S202: Perform similarity matching between the first feature vector and the second feature vector, and train the target neural network for the current round according to the result of the similarity matching.
S203: Input the first feature vector into the classifier to be compressed to obtain a first classification result; input the second feature vector into the target classifier to obtain a second classification result.
S204: Train the target neural network and the target classifier for the current round according to the comparison between the first classification result and the second classification result.
S205: Obtain the compressed neural network model by training the target neural network and the target classifier for multiple rounds.
In a specific implementation, referring to the schematic diagram of the model compression process shown in Fig. 3, two functional modules are introduced for ease of describing this embodiment: a similarity matching module and a comparing module. The similarity matching module performs similarity matching between the first feature vector and the second feature vector; the comparing module compares the first classification result with the second classification result.
The training data is input into the neural network model to be compressed and the target neural network model. After the training data enters the model to be compressed, two processes are executed: first, the neural network to be compressed extracts features from the training data to obtain the first feature vector; then the first feature vector is passed to the classifier to be compressed, which, based on the first feature vector, classifies the training data that the first feature vector characterizes, obtaining the first classification result.
Similarly, after the training data enters the target neural network model, two processes are executed: first, the target neural network extracts features from the training data to obtain the second feature vector; then the second feature vector is passed to the target classifier, which, based on the second feature vector, classifies the training data that the second feature vector characterizes, obtaining the second classification result.
Compressing the neural network model to be compressed thus actually amounts to guiding the training of the target neural network model with the model to be compressed, so that the compressed model obtained by training and the model to be compressed classify the same training data consistently. That is, when the neural network to be compressed and the compressed neural network extract features from the same training data, the resulting feature vectors should be as similar as possible; meanwhile, when the classifier to be compressed and the compressed classifier, each working from those closely matched feature vectors, classify the training data they characterize, the classification results should agree. Training the target neural network model therefore requires training both the target neural network and the target classifier.
During training, the parameters of the target neural network are influenced by the result of the similarity matching between the first feature vector and the second feature vector, and are adjusted according to that result. Because the parameters of the target neural network differ from those of the neural network to be compressed, the first feature vector and the second feature vector can hardly become identical; the aim is to make the second feature vector that the target neural network extracts from the training data approach the first feature vector as closely as possible. Meanwhile, the parameters of the target neural network are also influenced by the second classification result that the target classifier produces for the training data characterized by the second feature vector: whenever the second classification result is inconsistent with the first, the parameters of the target neural network are adjusted so that the second classification result produced by the target classifier becomes consistent with the first.
The parameters of the target classifier are influenced by the comparison between the first classification result and the second classification result: when the two are inconsistent, the parameters of the target classifier are adjusted so that the second classification result becomes consistent with the first.
Accordingly, after the training data is input into the neural network model to be compressed and the target neural network model, the first feature vector is first extracted from the training data by the neural network to be compressed and the second feature vector by the target neural network. The first and second feature vectors of the same training data are then passed to the similarity matching module, which matches them and trains the target neural network for the current round according to the matching result. Meanwhile, the first feature vector is input into the classifier to be compressed to obtain the first classification result, and the second feature vector into the target classifier to obtain the second classification result; the two classification results are passed to the comparing module, which compares them and, according to the comparison, trains the target neural network and the target classifier for the current round.
The compressed neural network model is obtained by training the target neural network and the target classifier for multiple rounds.
It should be noted here that one round of training means training the target neural network model with the same piece of training data until the second feature vector that the target neural network extracts from the training data and the second classification result obtained from it both satisfy preset conditions; multiple rounds of training means training the target neural network with multiple pieces of training data, each piece of training data driving one round. One possible rendering of a single round in code follows.
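The sketch below assumes the FeatureNet-style models from the earlier sketch, and assumes the similarity matching is realized as a mean-squared-error loss on the feature vectors and the classification comparison as a cross-entropy loss against the model to be compressed's predicted classes; the patent leaves the concrete losses open.

```python
# One round of training on the same batch of training data, as in Fig. 3.
# Loss forms, weights, and the stopping condition are assumptions.
import torch
import torch.nn.functional as F

def train_one_round(teacher, student, x, optimizer, n_steps=100):
    teacher.eval()
    with torch.no_grad():
        f1, logits1 = teacher(x)        # first feature vector, first classification
        target = logits1.argmax(dim=1)  # classification result of the model to be compressed
    for _ in range(n_steps):
        f2, logits2 = student(x)        # second feature vector, second classification
        sim_loss = F.mse_loss(f2, f1)                # similarity matching on features
        cls_loss = F.cross_entropy(logits2, target)  # match classification results
        optimizer.zero_grad()
        (sim_loss + cls_loss).backward()
        optimizer.step()
        # preset condition: classifications agree and features are close enough
        if logits2.argmax(dim=1).eq(target).all() and sim_loss.item() < 1e-3:
            break
```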
Specifically, Embodiment 3 of the present application further provides a method for training the target neural network and the target classifier for the current round according to the comparison between the first classification result and the second classification result, including: executing the following first comparing operation until the classification loss of the target neural network model falls within a preset loss range, completing the current round of training for the target neural network and the target classifier.
Referring to Fig. 4, the first comparing operation includes:
S401: Compare whether the first classification result and the second classification result are consistent; if so, jump to S402; if not, jump to S403.
S402: Complete the current round of training for the target neural network and the target classifier; this flow ends.
S403: Generate first feedback information, and adjust the parameters of the target neural network and the target classifier based on the first feedback information.
S404: Use the target neural network and the target classifier, based on the adjusted parameters, to determine a new second classification result for the training data, and execute S401 again.
In a specific implementation, to ensure the accuracy of the compressed neural network model obtained after multiple rounds of training the target neural network model, the compressed model and the model to be compressed must classify the same training data consistently. The comparing module therefore compares the first classification result with the second classification result. When the comparison shows inconsistency, first feedback information is generated, and based on it the parameters of the target neural network and the target classifier are adjusted; the target neural network and target classifier with the adjusted parameters are then used to determine a new second classification result for the training data, the first comparing operation above is performed on the first classification result and the new second classification result, and the process repeats until the first classification result and the second classification result are consistent.
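A minimal sketch of this first comparing operation, with the "first feedback information" read as a gradient signal from a cross-entropy loss against the predictions of the model to be compressed (an assumption; the patent does not fix the form of the feedback):

```python
# Sketch of the first comparing operation (Fig. 4): loop until the second
# classification result matches the first.
import torch.nn.functional as F

def first_compare_op(student, x, first_result, optimizer, max_iters=1000):
    for _ in range(max_iters):
        _, logits2 = student(x)
        second_result = logits2.argmax(dim=1)
        if second_result.eq(first_result).all():
            return True                                # S402: round complete
        loss = F.cross_entropy(logits2, first_result)  # S403: feedback signal
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                               # S404: new second result next pass
    return False
```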
In addition, referring to Fig. 5, Embodiment 4 of the present application further provides a method for performing similarity matching between the first feature vector and the second feature vector and training the target neural network for the current round according to the result of the similarity matching, including:
S501: Cluster the first feature vectors and the second feature vectors separately.
S502: Generate a first adjacency matrix according to the result of clustering the first feature vectors, and a second adjacency matrix according to the result of clustering the second feature vectors.
S503: Train the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix.
In a specific implementation, the first feature vectors can be regarded as points mapped into a high-dimensional space. These points are clustered according to the distances between them, points whose distance falls within a preset threshold being grouped into the same class; a first adjacency matrix describing the point-to-point relations is then formed according to the clustering result.
In the first adjacency matrix, the entry for two points is 1 if the two points belong to the same class in the clustering, and 0 if they do not.
For example, suppose there are 5 pieces of training data whose first feature vectors are numbered 1, 2, 3, 4, 5, and the result of clustering the first feature vectors is {1, 3}, {2}, {4, 5}. The adjacency matrix formed is then:

1 0 1 0 0
0 1 0 0 0
1 0 1 0 0
0 0 0 1 1
0 0 0 1 1
The second adjacency matrix is formed from the result of clustering the second feature vectors in the same way, so the details are not repeated. The construction is reproduced in code below.
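The following sketch rebuilds the 5-sample example above from cluster labels; the label encoding is illustrative.

```python
# Rebuild the first adjacency matrix from the clustering example above:
# clusters {1, 3}, {2}, {4, 5} over feature vectors numbered 1-5.
import numpy as np

def adjacency_from_clusters(labels: np.ndarray) -> np.ndarray:
    # entry (i, j) = 1 when samples i and j fall in the same cluster, else 0
    return (labels[:, None] == labels[None, :]).astype(np.int8)

labels = np.array([0, 1, 0, 2, 2])  # samples 1,3 share a cluster; so do 4,5
A1 = adjacency_from_clusters(labels)
# A1 == [[1 0 1 0 0],
#        [0 1 0 0 0],
#        [1 0 1 0 0],
#        [0 0 0 1 1],
#        [0 0 0 1 1]]
```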
Embodiment 4 of the present application further provides a method for training the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix. The method includes: executing the following similarity determination operation until the similarity measure between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, completing the current round of training for the target neural network.
Referring to Fig. 6, the similarity determination operation includes:
S601: Compare whether the similarity measure between the first adjacency matrix and the second adjacency matrix is less than the preset first similarity threshold; if so, execute S602; if not, execute S603.
Here, in a specific implementation, the similarity measure between the currently available first and second adjacency matrices is computed from the traces of the two matrices: the closer the trace of the first adjacency matrix is to the trace of the second adjacency matrix, the higher the similarity between the two matrices. When the distance between the traces is evaluated, the absolute difference between the trace of the first adjacency matrix and the trace of the second adjacency matrix can be taken as the similarity measure between the two matrices; the larger this absolute difference, the lower the similarity between the first adjacency matrix and the second adjacency matrix.
S602: Complete the current round of training for the target neural network; this flow ends.
S603: Generate first feedback information, and adjust the parameters of the target neural network based on the first feedback information.
S604: Use the target neural network, based on the adjusted parameters, to extract a new second feature vector from the training data; cluster the new second feature vectors to generate a new second adjacency matrix; and execute S601 again.
In a specific implementation, the higher the similarity between the first adjacency matrix and the second adjacency matrix, the more similar the classification of the first feature vectors characterized by the first adjacency matrix is to the classification of the second feature vectors characterized by the second adjacency matrix. The parameters of the target neural network are therefore adjusted according to the similarity between the two adjacency matrices, so that the second feature vector that the target neural network obtains by extracting features from the training data comes ever closer to the first feature vector obtained by feature extraction with the neural network to be compressed.
In addition, referring to Fig. 7, Embodiment 5 of the present application provides another method for performing similarity matching between the first feature vector and the second feature vector and training the target neural network for the current round according to the result of the similarity matching. The method includes:
S701: Perform a dimension-reduction operation on the first feature vector and the second feature vector respectively, obtaining a first reduced feature vector from the first feature vector and a second reduced feature vector from the second feature vector.
In a specific implementation, the dimension-reduction operation on the first and second feature vectors can be performed by re-encoding them, for example by using a fully connected layer to capture the features of the first and second feature vectors once more, obtaining the first reduced feature vector and the second reduced feature vector.
S702: Compute the similarity between the first reduced feature vector and the second reduced feature vector.
Here, when the similarity between the first reduced feature vector and the second reduced feature vector is computed, the difference between the two vectors can be calculated and taken as the similarity result; or the element-wise subtraction of the two vectors can be performed directly, taking the result of the subtraction as the similarity result; or the first and second reduced feature vectors can be regarded as points projected into a corresponding space, and the difference between the point distributions computed. For example, project the first reduced feature vector and the second reduced feature vector into the corresponding space, obtaining the points S(X1, Y1, Z1) and M(X2, Y2, Z2); take the squared distance between the two points,

L = (X1 − X2)² + (Y1 − Y2)² + (Z1 − Z2)²,

as their similarity measure: the smaller the distance, the higher the similarity.
S703: Train the parameters of the target network for the current round according to the similarity between the first reduced feature vector and the second reduced feature vector.
Here, training the parameters of the target network for the current round according to this similarity actually ensures that the similarity measure between the first reduced feature vector and the second reduced feature vector stays within a preset second similarity threshold. Specifically, the following similarity determination operation can be executed until the similarity measure between the first reduced feature vector and the second reduced feature vector is less than the preset second similarity threshold, completing the current round of training for the target neural network.
Referring to Fig. 8, the similarity determination operation includes:
S801: Compare whether the similarity measure between the first reduced feature vector and the second reduced feature vector is less than the preset second similarity threshold; if so, execute S802; if not, execute S803.
Here, the method for computing the similarity between the first reduced feature vector and the second reduced feature vector is as described above for S702 and is not repeated.
S802: Complete the current round of training for the target neural network; this flow ends.
S803: Generate second feedback information, and adjust the parameters of the target neural network based on the second feedback information.
S804: Use the target neural network, based on the adjusted parameters, to extract a new second feature vector from the training data; perform the dimension-reduction operation on the new second feature vector to generate a new second reduced feature vector; and execute S801 again.
Specifically, to ensure that the first feature vector and the second feature vector are as close as possible, the similarity measure between them must fall below a certain threshold, that is, the similarity measure between the first reduced feature vector and the second reduced feature vector must be less than the preset second similarity threshold. Whenever the measure between the two is not less than the preset second similarity threshold, second feedback information is generated accordingly, and the parameters of the target neural network are adjusted based on it, so that when the target neural network next extracts a second feature vector from the training data, the vector changes in the direction that increases the similarity between the first reduced feature vector and the second reduced feature vector. The target neural network with the adjusted parameters is then used again to extract a new second feature vector from the training data, the dimension-reduction operation is performed on the new second feature vector again to generate a new second reduced feature vector, and the similarity computation is executed again, until the similarity measure between the first reduced feature vector and the second reduced feature vector is less than the preset second similarity threshold.
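One possible reading of S701-S703 in code, where the re-encoding is a single shared fully connected layer and the similarity is the squared Euclidean distance from the example above; the 256 → 64 sizes are assumptions.

```python
# Dimension reduction by re-encoding, then squared Euclidean distance
# L = sum_i (z1_i - z2_i)^2 as the similarity measure (smaller = more similar).
import torch
import torch.nn as nn

proj = nn.Linear(256, 64)  # shared fully connected layer for re-encoding

def reduced_similarity(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    z1, z2 = proj(f1), proj(f2)           # first / second reduced feature vectors
    return ((z1 - z2) ** 2).sum(dim=-1)   # per-sample squared distance L
```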
The compressed neural network model obtained with Embodiment 1 keeps its accuracy consistent with that of the neural network model to be compressed. However, for a model to be compressed that was obtained by unsupervised learning or transfer learning, if the model to be compressed classifies some piece of training data wrongly, the compressed model will, to a certain extent, classify that training data wrongly as well. Embodiment 6 of the present application therefore provides another neural network model compression method that can further improve the accuracy of the compressed neural network model.
Referring to Fig. 9, in the neural network model compression method provided by Embodiment 6, before similarity matching is performed between the first feature vector and the second feature vector, the method further includes:
S901: Perform a noise addition operation on the first feature vector.
In a specific implementation, noise is added to the first feature vector to increase the generalization ability of the compressed neural network model obtained by training; generalization ability refers to how well a machine-learning algorithm adapts to fresh samples. When the noise addition operation is performed on the first feature vector, noise of several different magnitudes, or of several different kinds, can be added repeatedly. Each addition of noise produces one noise-added first feature vector, and each noise-added first feature vector is offset to some degree from the original first feature vector, so that one piece of training data can yield first feature vectors with a variety of offsets. At the same time, this enriches the volume of first-feature-vector data: with the amount of first-feature-vector data held constant, the amount of input training data is reduced and the data can be fitted better. In addition, since the neural network model to be compressed does not necessarily classify every piece of training data accurately, perturbing the first feature vector may make the noise-added first feature vector closer to reality, providing better guidance for the training of the target neural network model.
When noise is added to the first feature vector, a noise vector with the same dimension as the first feature vector is generally constructed, and the noise is added into the first feature vector by adding the first feature vector and the noise vector element-wise at corresponding positions.
The noise vector with the same dimension as the first feature vector can be constructed directly or indirectly. Direct construction means directly generating a noise vector with the same dimension as the first feature vector: for example, when the first feature vector is 1 × 1000-dimensional, the constructed noise vector is also 1 × 1000-dimensional. Indirect construction means generating a noise vector of smaller dimension than the first feature vector and then zero-filling it to produce a noise vector of the same dimension as the first feature vector: for example, when the first feature vector is 1 × 1000-dimensional, an intermediate 1 × 500-dimensional noise vector is constructed, and zeros are filled in at arbitrary positions of the intermediate noise vector to ultimately form a 1 × 1000-dimensional noise vector.
In addition, noise of different magnitudes or of different kinds can be added to the first feature vector repeatedly: noise of different magnitudes can be obtained by changing the parameters of the noise-generation algorithm, or, when the noise vector is constructed indirectly, by filling zeros at different positions; noise of different kinds can be obtained by changing the noise-generation algorithm itself.
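The two construction routes might look as follows; the dimensions (1 × 1000 feature, 1 × 500 intermediate) follow the example in the text, while the Gaussian noise distribution is an assumption, since the patent only requires the dimensions to match.

```python
# Direct and indirect construction of a noise vector matching a 1 x 1000
# first feature vector, following the example in the text.
import numpy as np

def direct_noise(dim=1000, sigma=0.1):
    return np.random.normal(0.0, sigma, size=dim)        # same dimension as feature

def indirect_noise(dim=1000, inner_dim=500, sigma=0.1):
    inner = np.random.normal(0.0, sigma, size=inner_dim)        # 1 x 500 vector
    out = np.zeros(dim)
    pos = np.random.choice(dim, size=inner_dim, replace=False)  # arbitrary slots
    out[pos] = inner                     # remaining positions stay zero-filled
    return out

feature = np.random.randn(1000)          # a first feature vector
noisy = feature + indirect_noise()       # element-wise addition at each position
```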
S902: Perform similarity matching between the noise-added first feature vector and the second feature vector.
The method for performing similarity matching between the noise-added first feature vector and the second feature vector is the same as that for a first feature vector without added noise; it may refer to the description above and is not repeated here.
In addition, in this embodiment, because noise is added to the first feature vector, the classification result that the classifier to be compressed produces for the noise-added first feature vector may differ from the classification result of the original first feature vector; if no correction were made, the accuracy of the finally obtained compressed neural network model would suffer.
Therefore, in this embodiment of the present application, while the noise-added first feature vector and the second feature vector undergo similarity matching and the training of the target neural network based on the matching result proceeds, the following second comparing operation is also executed until the first classification result is consistent with the label of the training data, completing the current round of training for the neural network to be compressed and the classifier to be compressed.
The second comparing operation includes:
comparing the first classification result with the label of the training data;
when the comparison shows inconsistency, generating third feedback information, and adjusting the parameters of the neural network to be compressed and the classifier to be compressed based on it;
using the neural network to be compressed and the classifier to be compressed, based on the adjusted parameters, to extract a new first classification result for the training data, and executing the second comparing operation again.
Through the above steps, fine-tuning of the neural network model to be compressed is realized, so that both the model to be compressed and the compressed neural network model obtained by training can have better generalization ability and higher accuracy.
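A sketch of this fine-tuning step, reading the feedback as a cross-entropy loss between the first classification result and the training-data label (the loss choice is an assumption):

```python
# Sketch of the second comparing operation (Embodiment 6): fine-tune the
# network and classifier to be compressed until the first classification
# result matches the training-data label.
import torch.nn.functional as F

def second_compare_op(teacher, x, labels, teacher_optimizer, max_iters=1000):
    for _ in range(max_iters):
        _, logits1 = teacher(x)
        if logits1.argmax(dim=1).eq(labels).all():
            return True                          # first result matches the labels
        loss = F.cross_entropy(logits1, labels)  # feedback to network + classifier
        teacher_optimizer.zero_grad()
        loss.backward()
        teacher_optimizer.step()
    return False
```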
Based on the same inventive concept, an embodiment of the present invention further provides a neural network model compression device corresponding to the neural network model compression method. Since the principle by which the device solves the problem is similar to that of the method described above, the implementation of the device may refer to the implementation of the method, and repeated descriptions are omitted.
Referring to Fig. 10, the neural network model compression device provided by Embodiment 7 of the present invention specifically includes:
an input module 11, configured to input training data into a neural network model to be compressed and a target neural network model;
a first training module 12, configured to train the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the target neural network model has fewer parameters than the neural network model to be compressed.
In the neural network model compression device provided by the embodiments of the present application, when a neural network model is to be compressed, a target neural network whose number of parameters is smaller than that of the model to be compressed is constructed in advance. The training data is then input into both the neural network model to be compressed and the target neural network model, and the feature vectors and classification results that the model to be compressed extracts from the training data guide the training of the target neural network model, yielding a compressed neural network model. The compression is not carried out directly on the model to be compressed, and the resulting compressed model and the model to be compressed classify the same training data identically, so the compression causes no loss of accuracy: the size of the model is reduced while accuracy is ensured, meeting the dual requirements of accuracy and model size.
Optionally, the device further includes a second training module 13, configured to, before the training data is input into the neural network model to be compressed and the target neural network model, input the training data into the neural network model to be compressed and train it, obtaining the trained neural network model to be compressed.
Optionally, the neural network model to be compressed includes a neural network to be compressed and a classifier to be compressed, and the target neural network model includes a target neural network and a target classifier.
The first training module 12 is specifically configured to: extract a first feature vector from the training data input to the neural network to be compressed, and extract a second feature vector from the training data input to the target neural network;
perform similarity matching between the first feature vector and the second feature vector, and train the target neural network for the current round according to the result of the similarity matching; and
input the first feature vector into the classifier to be compressed to obtain a first classification result;
input the second feature vector into the target classifier to obtain a second classification result;
train the target neural network and the target classifier for the current round according to the comparison between the first classification result and the second classification result;
obtain the compressed neural network model by training the target neural network and the target classifier for multiple rounds.
Optionally, the first training module 12 is specifically configured to execute the following first comparing operation until the classification loss of the target neural network model falls within a preset loss range, completing the current round of training for the target neural network and the target classifier.
The first comparing operation includes:
comparing the first classification result with the second classification result;
when the comparison shows inconsistency, generating first feedback information, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
using the target neural network and the target classifier, based on the adjusted parameters, to determine a new second classification result for the training data, and executing the first comparing operation again.
Optionally, the first training module 12 is further configured to: before similarity matching is performed between the first feature vector and the second feature vector, perform a noise addition operation on the first feature vector, and perform the similarity matching between the noise-added first feature vector and the second feature vector.
Optionally, the first training module 12 is specifically configured to perform similarity matching between the first feature vector and the second feature vector, and to train the target neural network for the current round according to the result of the similarity matching, through the following steps:
clustering the first feature vectors and the second feature vectors separately;
generating a first adjacency matrix according to the result of clustering the first feature vectors;
generating a second adjacency matrix according to the result of clustering the second feature vectors;
training the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix.
Optionally, the first training module 12 is specifically configured to execute the following similarity determination operation until the similarity measure between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, completing the current round of training for the target neural network.
The similarity determination operation includes:
computing the similarity measure between the currently available first adjacency matrix and second adjacency matrix;
when the measure is not less than the preset first similarity threshold, generating first feedback information, and adjusting the parameters of the target neural network based on the first feedback information;
using the target neural network, based on the adjusted parameters, to extract a new second feature vector from the training data;
clustering the new second feature vectors to generate a new second adjacency matrix, and executing the similarity computation again.
Optionally, the first training module 12 is specifically configured to perform similarity matching between the first feature vector and the second feature vector, and to train the target neural network for the current round according to the result of the similarity matching, through the following steps:
performing the dimension-reduction operation on the first feature vector and the second feature vector respectively, obtaining a first reduced feature vector from the first feature vector and a second reduced feature vector from the second feature vector;
computing the similarity between the first reduced feature vector and the second reduced feature vector;
training the parameters of the target network for the current round according to the similarity between the first reduced feature vector and the second reduced feature vector.
Optionally, the first training module 12 is specifically configured to execute the following similarity determination operation until the similarity measure between the first reduced feature vector and the second reduced feature vector is less than a preset second similarity threshold, completing the current round of training for the target neural network.
The similarity determination operation includes:
computing the similarity measure between the currently available first reduced feature vector and second reduced feature vector;
when the measure is not less than the preset second similarity threshold, generating second feedback information, and adjusting the parameters of the target neural network based on the second feedback information;
using the target neural network, based on the adjusted parameters, to extract a new second feature vector from the training data;
performing the dimension-reduction operation on the new second feature vector to generate a new second reduced feature vector, and executing the similarity computation again.
Corresponding to the neural network model compression method in Fig. 1, Embodiment 8 of the present invention further provides a computer device. As shown in Fig. 11, the device includes a memory 1000, a processor 2000, and a computer program stored in the memory 1000 and runnable on the processor 2000; the processor 2000 implements the steps of the above neural network model compression method when executing the computer program.
Specifically, the memory 1000 and the processor 2000 may be a general-purpose memory and a general-purpose processor, which are not specifically limited here. When the processor 2000 runs the computer program stored in the memory 1000, the above neural network model compression method can be executed, solving the problem that existing model compression methods compress a model on the premise of sacrificing its accuracy and so cannot meet accuracy requirements, and thereby achieving the effect of compressing the size of the model while the accuracy of the neural network model is ensured.
Corresponding to the neural network model compression method in Fig. 1, Embodiment 9 of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the above neural network model compression method are executed.
Specifically, the storage medium may be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above neural network model compression method can be executed, solving the problem that existing model compression methods compress a model on the premise of sacrificing its accuracy and so cannot meet accuracy requirements, and thereby achieving the effect of compressing the size of the model while the accuracy of the neural network model is ensured.
The computer program product of the neural network model compression method and device provided by the embodiments of the present invention includes a computer-readable storage medium storing program code; the instructions contained in the program code can be used to execute the methods in the foregoing method embodiments, whose specific implementation may refer to the method embodiments and is not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system and device described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution that those familiar with the art can easily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (10)
1. A neural network model compression method, characterized in that the method comprises:
inputting training data into a neural network model to be compressed and a target neural network model;
training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the target neural network model has fewer parameters than the neural network model to be compressed.
2. The method according to claim 1, characterized in that before inputting the training data into the neural network model to be compressed and the target neural network model, the method further comprises:
inputting the training data into the neural network model to be compressed and training the neural network model to be compressed, obtaining the trained neural network model to be compressed.
3. The method according to claim 1, characterized in that the neural network model to be compressed comprises a neural network to be compressed and a classifier to be compressed, and the target neural network model comprises a target neural network and a target classifier;
the training of the target neural network model based on the feature vectors and classification results extracted from the training data by the neural network model to be compressed, to obtain a compressed neural network model, specifically comprises:
extracting a first feature vector from the input training data using the neural network to be compressed, and extracting a second feature vector from the input training data using the target neural network;
performing similarity matching on the first feature vector and the second feature vector, and performing a current round of training on the target neural network according to the result of the similarity matching; and
inputting the first feature vector into the classifier to be compressed to obtain a first classification result;
inputting the second feature vector into the target classifier to obtain a second classification result;
performing a current round of training on the target neural network and the target classifier according to the result of comparing the first classification result with the second classification result;
obtaining the compressed neural network model by performing multiple rounds of training on the target neural network and the target classifier.
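Continuing the sketch above, one round of the training outlined in claim 3 could combine a feature-similarity term with a classification-agreement term. The cosine and KL-divergence choices and the equal weighting of the two terms are assumptions; the claim only requires that both signals drive the parameter updates.

```python
import torch.nn.functional as F

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def train_round(batch: torch.Tensor) -> float:
    with torch.no_grad():              # the teacher provides fixed guidance
        t_feat, t_logits = teacher(batch)
    s_feat, s_logits = student(batch)
    # similarity matching between first and second feature vectors
    feat_loss = 1.0 - F.cosine_similarity(s_feat, t_feat, dim=1).mean()
    # agreement between first and second classification results
    cls_loss = F.kl_div(F.log_softmax(s_logits, dim=1),
                        F.softmax(t_logits, dim=1), reduction="batchmean")
    loss = feat_loss + cls_loss        # equal weighting is an assumption
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

for _ in range(100):                   # multiple rounds of training
    train_round(x)
```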
4. The method according to claim 3, characterized in that the following first comparison operation is performed until the classification loss of the target neural network model meets a preset loss range, completing the current round of training of the target neural network and the target classifier;
the first comparison operation comprises:
comparing the first classification result with the second classification result;
in the case that the comparison results are inconsistent, generating first feedback information, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
determining a new second classification result for the training data using the target neural network and the target classifier with the adjusted parameters, and performing the first comparison operation again.
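A hedged reading of claim 4's first comparison operation, reusing the optimizer and models from the sketches above: keep adjusting the target network and classifier until the classification loss falls inside a preset range. The bound value and round cap are illustrative.

```python
LOSS_BOUND = 0.05   # illustrative preset loss range

def first_comparison(batch: torch.Tensor, max_rounds: int = 1000) -> None:
    for _ in range(max_rounds):
        with torch.no_grad():
            _, t_logits = teacher(batch)
        _, s_logits = student(batch)
        cls_loss = F.kl_div(F.log_softmax(s_logits, dim=1),
                            F.softmax(t_logits, dim=1), reduction="batchmean")
        if cls_loss.item() < LOSS_BOUND:  # the two results now agree closely
            break
        opt.zero_grad()                   # the gradient plays the role of the
        cls_loss.backward()               # "first feedback information" that
        opt.step()                        # drives the parameter adjustment
```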
5. The method according to claim 3, characterized in that, before the similarity matching is performed on the first feature vector and the second feature vector, the method further comprises:
performing a noise addition operation on the first feature vector;
the performing of similarity matching on the first feature vector and the second feature vector specifically comprises:
performing similarity matching between the noise-added first feature vector and the second feature vector.
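Claim 5's noise addition could look as follows; Gaussian noise with a fixed scale is an assumption, as the claim does not specify the noise type.

```python
def add_noise(feat: torch.Tensor, sigma: float = 0.1) -> torch.Tensor:
    """Perturb the teacher's feature vector before similarity matching."""
    return feat + sigma * torch.randn_like(feat)

noisy_t_feat = add_noise(t_feat)  # then match noisy_t_feat against s_feat
```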
6. The method according to any one of claims 3 to 5, characterized in that the performing of similarity matching on the first feature vector and the second feature vector, and the current round of training of the target neural network according to the result of the similarity matching, specifically comprises:
clustering the first feature vector and the second feature vector respectively;
generating a first adjacency matrix according to the clustering result of the first feature vector;
generating a second adjacency matrix according to the clustering result of the second feature vector;
performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix.
7. The method according to claim 6, characterized in that the following similarity determination operation is performed until the similarity between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, completing the current round of training of the target neural network;
the similarity determination operation comprises:
calculating the similarity between the currently obtained first adjacency matrix and second adjacency matrix;
in the case that the similarity is not less than the preset first similarity threshold, generating first feedback information, and adjusting the parameters of the target neural network based on the first feedback information;
extracting a new second feature vector from the training data using the target neural network with the adjusted parameters;
clustering the new second feature vector, generating a new second adjacency matrix, and performing the similarity determination operation again.
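Claims 6 and 7 can be sketched with scikit-learn's KMeans, continuing the earlier sketch: cluster each feature set, build a same-cluster adjacency matrix for each, and compare the two matrices. Reading the claimed "similarity" as a divergence that training drives below the threshold is an interpretation, as is the cluster count.

```python
import numpy as np
from sklearn.cluster import KMeans

def adjacency(features: np.ndarray, k: int = 4) -> np.ndarray:
    """Cluster the feature vectors; mark same-cluster sample pairs with 1."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(features)
    return (labels[:, None] == labels[None, :]).astype(np.float32)

a1 = adjacency(t_feat.numpy())           # first adjacency matrix (teacher)
a2 = adjacency(s_feat.detach().numpy())  # second adjacency matrix (student)
divergence = np.abs(a1 - a2).mean()      # 0 when the cluster structures agree
```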
8. The method according to any one of claims 3 to 5, characterized in that the performing of similarity matching on the first feature vector and the second feature vector, and the current round of training of the target neural network according to the result of the similarity matching, specifically comprises:
performing a dimension reduction operation on the first feature vector and the second feature vector respectively, to obtain a first dimension-reduced feature vector of the first feature vector and a second dimension-reduced feature vector of the second feature vector;
calculating the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector;
performing the current round of training on the parameters of the target neural network according to the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector.
9. The method according to claim 8, characterized in that the following similarity determination operation is performed until the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector is less than a preset second similarity threshold, completing the current round of training of the target neural network;
the similarity determination operation comprises:
calculating the similarity between the currently obtained first dimension-reduced feature vector and second dimension-reduced feature vector;
in the case that the similarity is not less than the preset second similarity threshold, generating second feedback information, and adjusting the parameters of the target neural network based on the second feedback information;
extracting a new second feature vector from the training data using the target neural network with the adjusted parameters;
performing the dimension reduction operation on the new second feature vector, generating a new second dimension-reduced feature vector, and performing the similarity determination operation again.
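Claims 8 and 9 can be sketched with PCA as the dimension-reduction operation; PCA is an assumption, since the claims do not name one. Fitting a single projection on the teacher's features and applying it to both sets is a design choice that keeps the two reduced vectors directly comparable.

```python
import numpy as np
from sklearn.decomposition import PCA

t_np = t_feat.numpy()
s_np = s_feat.detach().numpy()

pca = PCA(n_components=8).fit(t_np)  # one shared projection for both sets
r1 = pca.transform(t_np)             # first dimension-reduced feature vector
r2 = pca.transform(s_np)             # second dimension-reduced feature vector

# per-sample cosine distance, averaged over the batch; drive this below
# the preset second similarity threshold during training
cos = (r1 * r2).sum(1) / (np.linalg.norm(r1, axis=1) * np.linalg.norm(r2, axis=1))
divergence = 1.0 - cos.mean()
```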
10. A neural network model compression device, characterized in that the device comprises:
an input module, configured to input training data into a neural network model to be compressed and a target neural network model;
a training module, configured to train the target neural network model based on feature vectors and classification results extracted from the training data by the neural network model to be compressed, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the neural network model to be compressed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810274146.3A CN108510083B (en) | 2018-03-29 | 2018-03-29 | Neural network model compression method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810274146.3A CN108510083B (en) | 2018-03-29 | 2018-03-29 | Neural network model compression method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108510083A true CN108510083A (en) | 2018-09-07 |
CN108510083B CN108510083B (en) | 2021-05-14 |
Family
ID=63379557
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810274146.3A Active CN108510083B (en) | 2018-03-29 | 2018-03-29 | Neural network model compression method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108510083B (en) |
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170337711A1 (en) * | 2011-03-29 | 2017-11-23 | Lyrical Labs Video Compression Technology, LLC | Video processing and encoding |
US20150178620A1 (en) * | 2011-07-07 | 2015-06-25 | Toyota Motor Europe Nv/Sa | Artificial memory system and method for use with a computational machine for interacting with dynamic behaviours |
CN104661037A (en) * | 2013-11-19 | 2015-05-27 | 中国科学院深圳先进技术研究院 | Tampering detection method and system for compressed image quantization table |
WO2015089148A2 (en) * | 2013-12-13 | 2015-06-18 | Amazon Technologies, Inc. | Reducing dynamic range of low-rank decomposition matrices |
CN104331738A (en) * | 2014-10-21 | 2015-02-04 | 西安电子科技大学 | Network reconfiguration algorithm based on game theory and genetic algorithm |
EP3168781A1 (en) * | 2015-11-16 | 2017-05-17 | Samsung Electronics Co., Ltd. | Method and apparatus for recognizing object, and method and apparatus for training recognition model |
US20170249536A1 (en) * | 2016-02-29 | 2017-08-31 | Christopher J. Hillar | Self-Organizing Discrete Recurrent Network Digital Image Codec |
CN106096670A (en) * | 2016-06-17 | 2016-11-09 | 北京市商汤科技开发有限公司 | Cascaded convolutional neural network training and image detection method, apparatus and system |
CN106251347A (en) * | 2016-07-27 | 2016-12-21 | 广东工业大学 | Subway foreign matter detection method, device, equipment and subway platform screen door system |
CN106503799A (en) * | 2016-10-11 | 2017-03-15 | 天津大学 | Deep learning model based on multi-scale network and its application in brain state monitoring |
CN106778684A (en) * | 2017-01-12 | 2017-05-31 | 易视腾科技股份有限公司 | Deep neural network training method and face recognition method |
CN106845381A (en) * | 2017-01-16 | 2017-06-13 | 西北工业大学 | Spatial-spectral joint hyperspectral image classification method based on dual-channel convolutional neural networks |
Non-Patent Citations (2)
Title |
---|
JIAN-HAO LUO et al.: "ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression", Computer Vision Foundation *
WANG Zhengtao: "Research on Compression and Optimization of Deep Neural Networks", China Master's Theses Full-text Database (Information Science and Technology) *
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200019840A1 (en) * | 2018-07-13 | 2020-01-16 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems and methods for sequential event prediction with noise-contrastive estimation for marked temporal point process |
US12014267B2 (en) * | 2018-07-13 | 2024-06-18 | Arizona Board Of Regents On Behalf Of Arizona State University | Systems and methods for sequential event prediction with noise-contrastive estimation for marked temporal point process |
CN110929839B (en) * | 2018-09-20 | 2024-04-16 | 深圳市商汤科技有限公司 | Method and device for training neural network, electronic equipment and computer storage medium |
CN110929839A (en) * | 2018-09-20 | 2020-03-27 | 深圳市商汤科技有限公司 | Method and apparatus for training neural network, electronic device, and computer storage medium |
CN110163236B (en) * | 2018-10-15 | 2023-08-29 | 腾讯科技(深圳)有限公司 | Model training method and device, storage medium and electronic device |
CN110163236A (en) * | 2018-10-15 | 2019-08-23 | 腾讯科技(深圳)有限公司 | The training method and device of model, storage medium, electronic device |
CN111242273B (en) * | 2018-11-29 | 2024-04-12 | 华为终端有限公司 | Neural network model training method and electronic equipment |
WO2020108368A1 (en) * | 2018-11-29 | 2020-06-04 | 华为技术有限公司 | Neural network model training method and electronic device |
CN111242273A (en) * | 2018-11-29 | 2020-06-05 | 华为终端有限公司 | Neural network model training method and electronic equipment |
CN110008880A (en) * | 2019-03-27 | 2019-07-12 | 深圳前海微众银行股份有限公司 | A kind of model compression method and device |
CN110008880B (en) * | 2019-03-27 | 2023-09-29 | 深圳前海微众银行股份有限公司 | Model compression method and device |
CN112020724A (en) * | 2019-04-01 | 2020-12-01 | 谷歌有限责任公司 | Learning compressible features |
US12033077B2 (en) | 2019-04-01 | 2024-07-09 | Google Llc | Learning compressible features |
EP3935578A4 (en) * | 2019-05-16 | 2022-06-01 | Samsung Electronics Co., Ltd. | Neural network model apparatus and compressing method of neural network model |
US11657284B2 (en) | 2019-05-16 | 2023-05-23 | Samsung Electronics Co., Ltd. | Neural network model apparatus and compressing method of neural network model |
WO2020231049A1 (en) | 2019-05-16 | 2020-11-19 | Samsung Electronics Co., Ltd. | Neural network model apparatus and compressing method of neural network model |
CN110211121B (en) * | 2019-06-10 | 2021-07-16 | 北京百度网讯科技有限公司 | Method and device for pushing model |
CN110211121A (en) * | 2019-06-10 | 2019-09-06 | 北京百度网讯科技有限公司 | Method and apparatus for pushing model |
WO2022027937A1 (en) * | 2020-08-06 | 2022-02-10 | 苏州浪潮智能科技有限公司 | Neural network compression method, apparatus and device, and storage medium |
US12045729B2 (en) | 2020-08-06 | 2024-07-23 | Inspur Suzhou Intelligent Technology Co., Ltd. | Neural network compression method, apparatus and device, and storage medium |
WO2022062828A1 (en) * | 2020-09-23 | 2022-03-31 | 深圳云天励飞技术股份有限公司 | Image model training method, image processing method, chip, device and medium |
CN112288032B (en) * | 2020-11-18 | 2022-01-14 | 上海依图网络科技有限公司 | Method and device for quantization model training based on generative adversarial networks |
CN112288032A (en) * | 2020-11-18 | 2021-01-29 | 上海依图网络科技有限公司 | Method and device for quantization model training based on generative adversarial networks |
CN113505774B (en) * | 2021-07-14 | 2023-11-10 | 众淼创新科技(青岛)股份有限公司 | Policy identification model size compression method |
CN113505774A (en) * | 2021-07-14 | 2021-10-15 | 青岛全掌柜科技有限公司 | Novel policy identification model size compression method |
CN115526266B (en) * | 2022-10-18 | 2023-08-29 | 支付宝(杭州)信息技术有限公司 | Model Training Method and Device, Service Prediction Method and Device |
CN115526266A (en) * | 2022-10-18 | 2022-12-27 | 支付宝(杭州)信息技术有限公司 | Model training method and device, and business prediction method and device |
CN118015343A (en) * | 2024-01-18 | 2024-05-10 | 中移信息系统集成有限公司 | Image filtering method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN108510083B (en) | 2021-05-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108510083A (en) | A kind of neural network model compression method and device | |
Qin et al. | Forward and backward information retention for accurate binary neural networks | |
WO2021164625A1 (en) | Method of training an image classification model | |
Feng et al. | Evolutionary fuzzy particle swarm optimization vector quantization learning scheme in image compression | |
Lazebnik et al. | Supervised learning of quantizer codebooks by information loss minimization | |
CN108108751B (en) | Scene recognition method based on convolution multi-feature and deep random forest | |
CN113707235A (en) | Method, device and equipment for predicting properties of small drug molecules based on self-supervision learning | |
US20200257970A1 (en) | Data processing apparatus by learning of neural network, data processing method by learning of neural network, and recording medium recording the data processing method | |
WO2022179533A1 (en) | Quantum convolution operator | |
Carreira-Perpinán et al. | Model compression as constrained optimization, with application to neural nets. Part II: Quantization | |
CN107251059A (en) | Sparse reasoning module for deep learning | |
CN110728295B (en) | Semi-supervised landform classification model training and landform graph construction method | |
CN102331992A (en) | Distributed decision tree training | |
CN110969086A (en) | Handwritten image recognition method based on multi-scale CNN (CNN) features and quantum flora optimization KELM | |
CN112115967A (en) | Image increment learning method based on data protection | |
CN108647571A (en) | Video actions disaggregated model training method, device and video actions sorting technique | |
CN108334910A (en) | A kind of event detection model training method and event detecting method | |
CN113764034A (en) | Method, device, equipment and medium for predicting potential BGC in genome sequence | |
Sepahvand et al. | An adaptive teacher–student learning algorithm with decomposed knowledge distillation for on-edge intelligence | |
CN116629123A (en) | Pairing-based single-cell multi-group data integration method and system | |
JP7427011B2 (en) | Responding to cognitive queries from sensor input signals | |
KR102240882B1 (en) | Apparatus, method for generating classifier and classifying apparatus generated thereby | |
US20230020112A1 (en) | Relating complex data | |
Tang et al. | Bringing giant neural networks down to earth with unlabeled data | |
CN104933438A (en) | Image clustering method based on self-coding neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB02 | Change of applicant information | Address after: No. 101-8, Building 1, Zone 31, No. 188 South Fourth Ring Road, Fengtai District, Beijing, 100070; Applicant after: Guoxin Youyi Data Co., Ltd. Address before: Building 31, Headquarters Square, No. 188 South Fourth Ring Road West, Fengtai District, Beijing, 100070; Applicant before: SIC YOUE DATA Co., Ltd. |
GR01 | Patent grant | |