
CN108510083A - Neural network model compression method and device - Google Patents

Neural network model compression method and device

Info

Publication number
CN108510083A
Authority
CN
China
Prior art keywords: network model, neural network, feature vector, similarity, compressed
Prior art date
Legal status
Granted
Application number
CN201810274146.3A
Other languages
Chinese (zh)
Other versions
CN108510083B (en)
Inventor
孙源良
王亚松
刘萌
樊雨茂
Current Assignee
Guoxin Youe Data Co Ltd
Original Assignee
Guoxin Youe Data Co Ltd
Priority date
Filing date
Publication date
Application filed by Guoxin Youe Data Co Ltd
Priority to CN201810274146.3A
Publication of CN108510083A
Application granted
Publication of CN108510083B
Status: Active
Anticipated expiration


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a neural network model compression method and device, wherein the method includes: inputting training data into a neural network model to be compressed and into a target neural network model; and training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model; wherein the number of parameters of the target neural network model is smaller than the number of parameters of the neural network model to be compressed. In the embodiments of the present invention, the feature vectors and classification results extracted from the training data by the neural network model to be compressed guide the training of the target neural network model, and the compressed neural network model finally obtained classifies the same training data identically to the neural network model to be compressed, so no loss of accuracy is caused during model compression. The size of the model can therefore be compressed while accuracy is preserved, meeting the dual requirements on accuracy and model size.

Description

Neural network model compression method and device
Technical field
The present invention relates to the field of machine learning, and in particular to a neural network model compression method and device.
Background art
The rapid development of neural networks in fields such as image, speech, and text processing has driven a series of intellectual products to market. To let neural networks learn the features of training data better and improve model performance, the number of parameters used to represent a neural network model has grown rapidly and the number of layers keeps increasing. Deep neural network models therefore suffer from having numerous parameters and heavy computation during both training and application. As a result, most neural-network-based products depend on server-side computing power and rely heavily on good running and network environments, which restricts the application range of neural network models; for example, embedded applications cannot be realized. To enable embedded applications of a neural network model, its volume must be compressed below a certain range.
Current model compression methods generally include the following. First, pruning: after a large model is fully trained, parameters with very small weights are removed from the network model, and the model is then trained further. Second, weight sharing, which reduces the number of parameters by letting multiple connections share weights. Third, quantization: in general, the parameters of a neural network model are all represented as 32-bit floating-point numbers, but such high precision need not actually be retained; quantization represents the precision originally expressed by 32 bits with, for example, the integers 0-255, sacrificing precision to reduce the space occupied by each weight. Fourth, neural network binarization, which represents the parameters of the network model as binary numbers to reduce the model size.
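To make the quantization idea concrete, here is a minimal sketch (illustrative only; the patent does not prescribe any particular scheme) that maps 32-bit floating-point weights onto the integers 0-255 and back:

```python
import numpy as np

def quantize_uint8(weights: np.ndarray):
    """Affine quantization: map float32 weights onto the integers 0-255."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:          # all weights identical; any scale works
        scale = 1.0
    codes = np.round((weights - w_min) / scale).astype(np.uint8)
    return codes, scale, w_min

def dequantize_uint8(codes: np.ndarray, scale: float, w_min: float) -> np.ndarray:
    """Recover approximate float32 weights from the 8-bit codes."""
    return codes.astype(np.float32) * scale + w_min

w = np.random.randn(1000).astype(np.float32)
codes, scale, w_min = quantize_uint8(w)
w_hat = dequantize_uint8(codes, scale, w_min)
print(np.abs(w - w_hat).max())   # reconstruction error bounded by ~scale / 2
```

Each weight then occupies one byte instead of four, which is exactly the precision-for-space trade the passage describes.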
However, all of the above methods perform compression directly on the model to be compressed, and they do so at the cost of the model's accuracy, which often fails to meet the accuracy requirements in use.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a neural network model compression method and device that can compress the size of a model while ensuring the accuracy of the neural network model.
In a first aspect, an embodiment of the present invention provides a neural network model compression method, including:
inputting training data into a neural network model to be compressed and into a target neural network model;
training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the neural network model to be compressed.
In a second aspect, an embodiment of the present invention further provides a neural network model compression device, including:
an input module, configured to input training data into a neural network model to be compressed and into a target neural network model;
a training module, configured to train the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the neural network model to be compressed.
With the neural network model compression method and device provided by the embodiments of the present application, when a neural network model to be compressed is compressed, a target neural network whose number of parameters is smaller than that of the neural network model to be compressed is constructed in advance. The training data is then input into both the neural network model to be compressed and the target neural network model, and the feature vectors and classification results that the neural network model to be compressed extracts from the training data guide the training of the target neural network model, yielding a compressed neural network model. The compressed neural network model finally obtained classifies the same training data identically to the neural network model to be compressed, so no loss of accuracy is caused during model compression; the size of the model can thus be compressed while accuracy is guaranteed, meeting the dual requirements on accuracy and model size.
To make the above objects, features, and advantages of the present invention clearer and easier to understand, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly described below. It should be understood that the following drawings show only certain embodiments of the present invention and should not be regarded as limiting its scope; from these drawings, those of ordinary skill in the art can obtain other related drawings without creative effort.
Fig. 1 shows a flowchart of the neural network model compression method provided by Embodiment 1 of the present application;
Fig. 2 shows a flowchart of a specific method, provided by Embodiment 2 of the present application, for training the target neural network model based on the classification results of the neural network model to be compressed on the training data;
Fig. 3 shows a schematic diagram of the model compression process provided by Embodiment 2 of the present application;
Fig. 4 shows a flowchart of the first comparison operation provided by Embodiment 3 of the present application;
Fig. 5 shows a flowchart of a specific method, provided by Embodiment 4 of the present application, for performing similarity matching on the first feature vector and the second feature vector and training the target neural network for the current round according to the similarity matching result;
Fig. 6 shows a flowchart of the similarity determination operation provided by Embodiment 4 of the present application;
Fig. 7 shows a flowchart of another specific method, provided by Embodiment 5 of the present application, for performing similarity matching on the first feature vector and the second feature vector and training the target neural network for the current round according to the similarity matching result;
Fig. 8 shows a flowchart of the similarity determination operation provided by Embodiment 5 of the present application;
Fig. 9 shows a flowchart of the neural network model compression method provided by Embodiment 6 of the present application;
Fig. 10 shows a structural schematic diagram of the neural network model compression device provided by Embodiment 7 of the present application;
Fig. 11 shows a structural schematic diagram of a computer device provided by Embodiment 8 of the present application.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments, as generally described and illustrated herein and in the drawings, can be arranged and designed in a variety of different configurations. The following detailed description of the embodiments is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
To facilitate understanding of this embodiment, the neural network model compression method disclosed in the embodiments of the present invention is first described in detail; the method can be used to compress the size of various neural network models.
The neural network model compression method provided by Embodiment 1 of the present application, shown in Fig. 1, includes:
S101: inputting training data into a neural network model to be compressed and into a target neural network model.
In specific implementation, the neural network model to be compressed is a neural network model of larger volume that has already been trained with the training data; it may be a single neural network or a neural network model composed of an ensemble of multiple neural networks. Compared with the target neural network model, it has more parameters; the parameters here may include the number of feature extraction layers of the neural network and/or the parameters within each feature extraction layer.
Therefore, to compress the neural network model to be compressed, the training data needs to be input into it so that it learns the features of the training data. This trains the neural network model to be compressed, and the trained model is taken as the neural network model that needs to be compressed.
The target neural network model is a pre-constructed neural network model with fewer parameters than the neural network model to be compressed, for example fewer feature extraction layers, a simpler network structure, or feature extraction layers with fewer parameters.
It should be noted here that if the neural network model to be compressed was trained with an unsupervised training method, the training data is unlabeled; if it was trained with a supervised training method, the training data is labeled; and if it was trained with a transfer learning method, the training data may be either unlabeled or labeled.
S102: training the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain the compressed neural network model.
In specific implementation, the training data is input into both the neural network model to be compressed and the target neural network model. The classification results of the neural network model to be compressed on the training data guide the training of the target neural network model, and during that training the target model's classification results on the training data are made as close as possible to those of the neural network model to be compressed.
With the neural network model compression method provided by the embodiments of the present application, when a neural network model to be compressed is compressed, a target neural network whose number of parameters is smaller than that of the neural network model to be compressed is constructed in advance. The training data is then input into both the neural network model to be compressed and the target neural network model, and the feature vectors and classification results extracted from the training data by the neural network model to be compressed guide the training of the target neural network model, yielding a compressed neural network model. Compression is thus achieved by training the model that becomes the compressed model rather than by cutting down the model to be compressed, and the compressed neural network model finally obtained classifies the same training data identically to the neural network model to be compressed. No loss of accuracy is caused during model compression, so the size of the model can be compressed while accuracy is guaranteed, meeting the dual requirements on accuracy and model size.
Specifically, the neural network model to be compressed generally includes: a neural network to be compressed and a classifier to be compressed. The target neural network model generally includes: a target neural network and a target classifier. The compressed neural network model obtained by training includes: a compressed neural network and a compressed classifier.
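For concreteness, the following sketch (all architectures and sizes are illustrative assumptions; the patent fixes none of them) instantiates the four components named above, with the target network deliberately smaller than the network to be compressed:

```python
import torch
import torch.nn as nn

# Network to be compressed ("teacher"): more layers, more parameters.
teacher_net = nn.Sequential(
    nn.Linear(784, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 128), nn.ReLU(),
)
teacher_clf = nn.Linear(128, 10)   # classifier to be compressed

# Target network ("student"): fewer layers and parameters, with the same
# feature and output dimensions so the two models can be compared directly.
student_net = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
)
student_clf = nn.Linear(128, 10)   # target classifier

x = torch.randn(32, 784)                   # a batch of training data
f1 = teacher_net(x)                        # first feature vector
f2 = student_net(x)                        # second feature vector
p1, p2 = teacher_clf(f1), student_clf(f2)  # first/second classification results
```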
As shown in Fig. 2, Embodiment 2 of the present application further provides a specific method for training the target neural network model based on the classification results of the neural network model to be compressed on the training data, including:
S201: extracting a first feature vector from the input training data using the neural network to be compressed, and extracting a second feature vector from the input training data using the target neural network.
S202: performing similarity matching on the first feature vector and the second feature vector, and training the target neural network for the current round according to the result of the similarity matching;
S203: inputting the first feature vector into the classifier to be compressed to obtain a first classification result;
inputting the second feature vector into the target classifier to obtain a second classification result;
S204: training the target neural network and the target classifier for the current round according to the comparison result of the first classification result and the second classification result;
S205: obtaining the compressed neural network model by training the target neural network and the target classifier for multiple rounds.
In specific implementation, referring to the model compression process shown in Fig. 3, two functional modules are introduced in this embodiment for convenience of description: a similarity matching module and a comparison module. The similarity matching module performs similarity matching on the first feature vector and the second feature vector; the comparison module compares the first classification result with the second classification result.
The training data is input into the neural network model to be compressed and the target neural network model. After the training data is input into the neural network model to be compressed, two processes are executed: first, the neural network to be compressed performs feature extraction on the training data to obtain the first feature vector of the training data; the first feature vector is then passed to the classifier to be compressed, which, based on the first feature vector, classifies the training data it characterizes and obtains the first classification result.
Similarly, after the training data is input into the target neural network model, two processes are also executed: first, the target neural network performs feature extraction on the training data to obtain the second feature vector of the training data; the second feature vector is then passed to the target classifier, which, based on the second feature vector, classifies the training data it characterizes and obtains the second classification result.
The process of compressing the neural network model to be compressed is actually realized by letting the neural network model to be compressed guide the training of the target neural network model, so that the compressed neural network model obtained by training classifies the same training data consistently with the neural network model to be compressed. That is, when the neural network to be compressed and the compressed neural network extract features from the same training data, the resulting feature vectors should be as similar as possible; at the same time, when the classifier to be compressed and the compressed classifier classify the training data characterized by those close feature vectors, their classification results should be consistent. Training the target neural network model therefore means training both the target neural network and the target classifier.
During training, the parameters of the target neural network are influenced by the result of the similarity matching between the first feature vector and the second feature vector and are adjusted according to that result. Because the parameters of the target neural network differ from those of the neural network to be compressed, it is very difficult for the second feature vector to become identical to the first; the goal is therefore to make the second feature vector that the target neural network extracts from the training data approach the first feature vector as closely as possible. Meanwhile, the parameters of the target neural network are also influenced by the second classification result that the target classifier produces for the training data characterized by the second feature vector: when the second classification result is inconsistent with the first, the parameters of the target neural network are adjusted so that the second classification result obtained by the target classifier becomes consistent with the first classification result.
The parameters of the target classifier are influenced by the comparison result of the first and second classification results: when the first and second classification results are inconsistent, the parameters of the target classifier are adjusted so that the second classification result becomes consistent with the first.
Accordingly, after the training data is input into the neural network model to be compressed and the target neural network model, the first feature vector is extracted from the training data by the neural network to be compressed, and the second feature vector is extracted from the training data by the target neural network. The first and second feature vectors of the same training data are then passed to the similarity matching module, which performs similarity matching on them, and the target neural network is trained for the current round according to the matching result. Meanwhile, the first feature vector is input into the classifier to be compressed to obtain the first classification result, and the second feature vector is input into the target classifier to obtain the second classification result; the two classification results are passed to the comparison module, which compares them, and the target neural network and the target classifier are trained for the current round according to the comparison result.
The compressed neural network model is obtained by training the target neural network and the target classifier for multiple rounds.
It should be noted here that one round of training refers to training the target neural network model with the same training data until both the second feature vector obtained by the target neural network's feature extraction on the training data and the second classification result obtained by classification satisfy preset conditions; multiple rounds of training refers to training the target neural network with multiple pieces of training data, each piece of training data corresponding to one round of training.
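Combining the two training signals, a single round might look like the sketch below, which reuses the modules from the previous sketch; the concrete losses (cosine distance for feature matching, KL divergence for matching classification results) are assumed stand-ins for the patent's unspecified feedback information:

```python
import torch.nn.functional as F

opt = torch.optim.Adam(
    list(student_net.parameters()) + list(student_clf.parameters()), lr=1e-3)

def train_round(x: torch.Tensor, max_steps: int = 100, feat_tol: float = 0.05):
    """One training round on the same batch x (S201-S204)."""
    with torch.no_grad():                       # the trained model to be
        f1 = teacher_net(x)                     # compressed stays fixed here
        p1 = F.softmax(teacher_clf(f1), dim=1)  # first classification result
    for _ in range(max_steps):
        f2 = student_net(x)                     # second feature vector
        logits2 = student_clf(f2)               # second classification result
        feat_loss = 1.0 - F.cosine_similarity(f1, f2, dim=1).mean()
        cls_loss = F.kl_div(F.log_softmax(logits2, dim=1), p1,
                            reduction="batchmean")
        opt.zero_grad()
        (feat_loss + cls_loss).backward()
        opt.step()
        # The round ends once features are close and predictions agree.
        if feat_loss.item() < feat_tol and \
           bool((logits2.argmax(1) == p1.argmax(1)).all()):
            break
```

Calling `train_round` once per piece of training data then corresponds to the multiple rounds of S205.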
Specifically, Embodiment 3 of the present application further provides a specific method for training the target neural network and the target classifier for the current round according to the comparison result of the first and second classification results, including: executing the following first comparison operation until the classification loss of the target neural network model falls within a preset loss range, thereby completing the current round of training of the target neural network and the target classifier.
As shown in Fig. 4, the first comparison operation includes:
S401: comparing whether the first classification result and the second classification result are consistent; if yes, jumping to S402; if no, jumping to S403.
S402: completing the current round of training of the target neural network and the target classifier; this flow ends.
S403: generating first feedback information, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
S404: determining a new second classification result for the training data using the target neural network and the target classifier with the adjusted parameters, and executing S401 again.
In specific implementation, to ensure the accuracy of the compressed neural network model obtained after multiple rounds of training, the compressed neural network model and the neural network model to be compressed must classify the same training data consistently. The comparison module therefore compares the first classification result with the second classification result. When the comparison result is inconsistent, first feedback information is generated; based on it, the parameters of the target neural network and the target classifier are adjusted. The target neural network and the target classifier with the adjusted parameters are then used to determine a new second classification result for the training data, and the above first comparison operation is performed on the first classification result and the new second classification result. This process is repeated until the first and second classification results are consistent.
In addition, as shown in Fig. 5, Embodiment 4 of the present application further provides a specific method for performing similarity matching on the first feature vector and the second feature vector and training the target neural network for the current round according to the result of the similarity matching, including:
S501: clustering the first feature vectors and the second feature vectors separately;
S502: generating a first adjacency matrix according to the clustering result of the first feature vectors, and generating a second adjacency matrix according to the clustering result of the second feature vectors;
S503: training the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix.
In specific implementation, the first feature vectors can be regarded as points mapped into a high-dimensional space. The points are clustered according to the distances between them, with points whose pairwise distance is within a preset threshold assigned to the same class; then, according to the clustering result, a first adjacency matrix over the points is formed.
In the first adjacency matrix, if two points belong to the same class in the clustering, the entry between them is 1; if two points do not belong to the same class, the entry between them is 0.
For example, suppose there are 5 pieces of training data, the corresponding first feature vectors are numbered 1, 2, 3, 4, 5, and the clustering result for the first feature vectors is {1, 3}, {2}, {4, 5}. The adjacency matrix formed is then:

    1 0 1 0 0
    0 1 0 0 0
    1 0 1 0 0
    0 0 0 1 1
    0 0 0 1 1

A second adjacency matrix is formed in the same way from the clustering result of the second feature vectors, so the details are not repeated.
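As a sketch of the adjacency-matrix construction (the clustering algorithm itself is not specified by the patent; any method that assigns cluster labels, such as k-means, would do):

```python
import numpy as np

def adjacency_from_labels(labels: np.ndarray) -> np.ndarray:
    """A[i, j] = 1 if points i and j fall in the same cluster, else 0."""
    return (labels[:, None] == labels[None, :]).astype(np.float32)

# The 5-sample example above: clusters {1, 3}, {2}, {4, 5}.
labels1 = np.array([0, 1, 0, 2, 2])
a1 = adjacency_from_labels(labels1)
print(a1)   # reproduces the matrix shown above
# The second adjacency matrix a2 is built the same way from cluster
# labels of the second feature vectors.
```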
Embodiment 4 of the present application further provides a method for training the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix. The method includes: executing the following similarity determination operation until the similarity measure between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, thereby completing the current round of training of the target neural network.
As shown in Fig. 6, the similarity determination operation includes:
S601: comparing whether the similarity measure between the first adjacency matrix and the second adjacency matrix is less than the preset first similarity threshold; if yes, executing S602; if no, executing S603.
Here, in specific implementation, when calculating the similarity between the currently available first and second adjacency matrices, the trace of the first adjacency matrix and the trace of the second adjacency matrix are computed: the closer the two traces, the higher the similarity between the two matrices. When evaluating the distance between the traces, the difference between the trace of the first adjacency matrix and the trace of the second adjacency matrix can be used as the similarity measure between the two matrices; that is, the larger the absolute value of the difference between the two traces, the lower the similarity of the first and second adjacency matrices.
S602: completing the current round of training of the target neural network; this flow ends.
S603: generating first feedback information, and adjusting the parameters of the target neural network based on the first feedback information;
S604: extracting a new second feature vector for the training data using the target neural network with the adjusted parameters, clustering the new second feature vectors to generate a new second adjacency matrix, and executing S601 again.
In specific implementation, the higher the similarity between the first adjacency matrix and the second adjacency matrix, the more similar the classification of the first feature vectors characterized by the first adjacency matrix is to the classification of the second feature vectors characterized by the second adjacency matrix. The parameters of the target neural network are therefore adjusted according to the similarity between the first and second adjacency matrices, so that the second feature vectors the target neural network extracts from the training data become ever closer to the first feature vectors extracted from the training data by the neural network to be compressed.
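Continuing the numpy sketch above, the trace-based measure used in S601 could read as follows (a literal rendering of the stated rule, not an optimized one):

```python
def trace_similarity(a1: np.ndarray, a2: np.ndarray) -> float:
    """Measure as stated: |tr(A1) - tr(A2)|; smaller means more similar."""
    return float(abs(np.trace(a1) - np.trace(a2)))

# S601: the current round of training ends once
# trace_similarity(a1, a2) < first_similarity_threshold.
```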
In addition, as shown in Fig. 7, Embodiment 5 of the present application provides another specific method for performing similarity matching on the first feature vector and the second feature vector and training the target neural network for the current round according to the result of the similarity matching. The method includes:
S701: performing a dimension reduction operation on the first feature vector and the second feature vector separately, obtaining a first dimension-reduced feature vector from the first feature vector and a second dimension-reduced feature vector from the second feature vector.
In specific implementation, the dimension reduction operation on the first and second feature vectors can be implemented by re-encoding them, for example by passing the first and second feature vectors through a fully connected layer that captures their features again, yielding the first dimension-reduced feature vector and the second dimension-reduced feature vector.
S702: calculating the similarity of the first dimension-reduced feature vector and the second dimension-reduced feature vector.
Here, when calculating the similarity between the first dimension-reduced feature vector and the second dimension-reduced feature vector, the difference between the two vectors can be computed and used as the similarity result; or element-wise subtraction can be performed directly between the first and second dimension-reduced feature vectors and the result of the subtraction used as the similarity result; or the two dimension-reduced feature vectors can each be regarded as a point projected into the corresponding space, and the difference between the point distributions calculated. For example, projecting the first and second dimension-reduced feature vectors into the corresponding space gives points S(X1, Y1, Z1) and M(X2, Y2, Z2), and the distance between the two points, L = (X1 - X2)^2 + (Y1 - Y2)^2 + (Z1 - Z2)^2, is used as their similarity: the smaller the distance, the greater the similarity.
S703: training the parameters of the target network for the current round according to the similarity of the first dimension-reduced feature vector and the second dimension-reduced feature vector.
Here, training the parameters of the target network for the current round according to this similarity actually means ensuring that the similarity distance between the first and second dimension-reduced feature vectors stays within a preset second similarity threshold. Specifically, the following similarity determination operation can be executed until the similarity measure of the first and second dimension-reduced feature vectors is less than the preset second similarity threshold, completing the current round of training of the target neural network.
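As a sketch of S701-S702, reusing the torch modules from the earlier sketches (the fully connected re-encoding layer and its sizes are assumptions made for illustration):

```python
reduce_fc = nn.Linear(128, 16)    # re-encoding layer that reduces dimension

def reduced_distance(f1: torch.Tensor, f2: torch.Tensor) -> torch.Tensor:
    """Squared distance between the dimension-reduced vectors, following
    L = (X1 - X2)^2 + (Y1 - Y2)^2 + ...; smaller distance = higher similarity."""
    r1 = reduce_fc(f1)            # first dimension-reduced feature vector
    r2 = reduce_fc(f2)            # second dimension-reduced feature vector
    return ((r1 - r2) ** 2).sum(dim=1).mean()
```

Whether the two feature vectors pass through one shared re-encoding layer or two separate ones is not fixed by the patent; a shared layer is the simpler assumption.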
As shown in Fig. 8, the similarity determination operation includes:
S801: comparing whether the similarity measure between the first dimension-reduced feature vector and the second dimension-reduced feature vector is less than the preset second similarity threshold; if yes, executing S802; if no, executing S803.
Here, the method for calculating the similarity between the first and second dimension-reduced feature vectors can be found in the description of S702 above and is not repeated here.
S802: completing the current round of training of the target neural network; this flow ends.
S803: generating second feedback information, and adjusting the parameters of the target neural network based on the second feedback information.
S804: extracting a new second feature vector for the training data using the target neural network with the adjusted parameters, performing the dimension reduction operation on the new second feature vector to generate a new second dimension-reduced feature vector, and executing S801 again.
Specifically, ensuring that the first feature vector and the second feature vector are as close as possible requires making the similarity distance between them smaller than a certain threshold, namely making the similarity measure between the first and second dimension-reduced feature vectors less than the preset second similarity threshold. When the measure between the two is not less than the preset second similarity threshold, second feedback information is generated accordingly, and the parameters of the target neural network are adjusted based on it, so that when the target neural network next extracts a second feature vector for the training data, the vector changes in the direction that increases the similarity between the first and second dimension-reduced feature vectors. The target neural network with the adjusted parameters is then used to extract a new second feature vector for the training data, the dimension reduction operation is performed on it again to generate a new second dimension-reduced feature vector, and the similarity calculation operation is executed again, until the similarity measure between the first and second dimension-reduced feature vectors is less than the preset second similarity threshold.
With the compressed neural network model obtained in Embodiment 1 of the present application, the accuracy of the compressed neural network model can be kept consistent with that of the neural network model to be compressed. For a model to be compressed obtained with unsupervised learning or transfer learning, however, if the neural network model to be compressed misclassifies certain training data, the compressed neural network model will, to some extent, also misclassify that training data. Embodiment 6 of the present application therefore provides another neural network model compression method that can further improve the accuracy of the compressed neural network model.
As shown in Fig. 9, the neural network model compression method provided by Embodiment 6 of the present application further includes, before the similarity matching is performed on the first feature vector and the second feature vector:
S901: performing a noise addition operation on the first feature vector.
In specific implementation, adding noise to the first feature vector increases the generalization ability of the compressed neural network model obtained by training; generalization ability refers to the adaptability of a machine learning algorithm to fresh samples. When performing the noise addition operation on the first feature vector, noise of multiple different degrees, or noise of multiple different kinds, can be added to it. Each addition of noise produces one noise-added first feature vector, and each noise-added first feature vector is offset to some degree from the original first feature vector, so a single piece of training data yields first feature vectors with a variety of offsets. This also enriches the amount of first-feature-vector data: with that amount held constant, less input training data is needed, allowing the data to fit better. In addition, since the classification of certain training data by the neural network model to be compressed is not necessarily accurate, perturbing the first feature vector may make the noise-added first feature vectors closer to reality, providing better guidance for the training of the target neural network model.
When adding noise to the first feature vector, a noise vector with the same dimension as the first feature vector is generally constructed, and the noise is added by summing the first feature vector and the noise vector at corresponding positions.
The noise vector with the same dimension as the first feature vector can be constructed directly or indirectly. Direct construction means directly generating a noise vector with the same dimension as the first feature vector; for example, when the dimension of the first feature vector is 1 × 1000, the constructed noise vector is also of dimension 1 × 1000. Indirect construction means generating a noise vector of lower dimension than the first feature vector and then zero-padding it to the same dimension; for example, when the dimension of the first feature vector is 1 × 1000, an intermediate noise vector of dimension 1 × 500 is constructed, and zeros are filled at arbitrary positions of the intermediate noise vector, finally forming a noise vector of dimension 1 × 1000.
Since noise of different degrees or of different kinds can be added to the first feature vector multiple times: noise of different degrees can be obtained by changing the parameters of the noise generation algorithm or, with the indirect construction method, by filling zeros at different positions; noise of different kinds can be obtained by changing the noise generation algorithm.
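A sketch of both construction routes, continuing the torch sketches above (Gaussian noise and the specific sizes are illustrative assumptions; the patent fixes neither the noise distribution nor its degree):

```python
def direct_noise(feat: torch.Tensor, std: float = 0.1) -> torch.Tensor:
    """Directly construct a noise vector of the same dimension and add it."""
    return feat + std * torch.randn_like(feat)

def indirect_noise(feat: torch.Tensor, std: float = 0.1) -> torch.Tensor:
    """Construct a lower-dimensional noise vector, zero-pad it at random
    positions up to the full dimension, then add it element-wise."""
    n, d = feat.shape
    low_dim = d // 2                     # e.g. 1 x 500 for a 1 x 1000 vector
    noise = torch.zeros(n, d)
    idx = torch.randperm(d)[:low_dim]    # positions that carry noise
    noise[:, idx] = std * torch.randn(n, low_dim)
    return feat + noise

noisy_f1 = direct_noise(f1)   # one noise-added first feature vector
```

Varying `std` changes the degree of the noise; varying the zero-filled positions or swapping the noise generator changes its kind, matching the two variation mechanisms described above.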
S902: performing similarity matching on the noise-added first feature vector and the second feature vector.
The method for performing similarity matching on the noise-added first feature vector and the second feature vector is the same as that for the first feature vector without added noise; see the descriptions above, which are not repeated here.
In addition, in this embodiment, because noise is added to the first feature vector, the classification result obtained when the classifier to be compressed classifies the noise-added first feature vector may differ from that of the original first feature vector; if this is not corrected, the accuracy of the finally obtained compressed neural network model will be affected.
Therefore, in this embodiment of the present application, while the noise-added first feature vector and the second feature vector undergo similarity matching and the target neural network is trained based on the similarity matching result, the following second comparison operation is also executed, until the first classification result is consistent with the label of the training data, completing the current round of training of the neural network to be compressed and the classifier to be compressed.
The second comparison operation includes:
comparing the first classification result with the label of the training data;
when the comparison result is inconsistent, generating third feedback information, and adjusting the parameters of the neural network to be compressed and the classifier to be compressed based on the third feedback information;
determining a new first classification result for the training data using the neural network to be compressed and the classifier to be compressed with the adjusted parameters, and executing the second comparison operation again.
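A sketch of this corrective step, fine-tuning the network and classifier to be compressed against the labels (the cross-entropy loss standing in for the feedback information is an assumption):

```python
teacher_opt = torch.optim.Adam(
    list(teacher_net.parameters()) + list(teacher_clf.parameters()), lr=1e-4)

def second_compare_step(x: torch.Tensor, labels: torch.Tensor) -> bool:
    """One pass of the second comparison operation; returns True once the
    first classification result matches the training-data labels."""
    logits1 = teacher_clf(teacher_net(x))      # first classification result
    if bool((logits1.argmax(1) == labels).all()):
        return True                            # consistent: round complete
    loss = F.cross_entropy(logits1, labels)    # drives the parameter adjustment
    teacher_opt.zero_grad()
    loss.backward()
    teacher_opt.step()
    return False
```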
Through the above steps, fine-tuning of the neural network model to be compressed can be realized, so that both the neural network model to be compressed and the compressed neural network model obtained by training have better generalization ability and higher accuracy.
Based on the same inventive concept, an embodiment of the present invention further provides a neural network model compression device corresponding to the neural network model compression method. Since the device in the embodiment of the present invention solves the problem on a principle similar to the above neural network model compression method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted.
The neural network model compression device provided by Embodiment 7 of the present invention, shown in Fig. 10, specifically includes:
an input module 11, configured to input training data into a neural network model to be compressed and into a target neural network model;
a first training module 12, configured to train the target neural network model based on the feature vectors and classification results that the neural network model to be compressed extracts from the training data, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the neural network model to be compressed.
With the neural network model compression device provided by the embodiments of the present application, when a neural network model to be compressed is compressed, a target neural network whose number of parameters is smaller than that of the neural network model to be compressed is constructed in advance. The training data is then input into both the neural network model to be compressed and the target neural network model, and the feature vectors and classification results extracted from the training data by the neural network model to be compressed guide the training of the target neural network model, yielding a compressed neural network model. Compression is thus achieved by training the model that becomes the compressed model rather than by cutting down the model to be compressed, and the compressed neural network model finally obtained classifies the same training data identically to the neural network model to be compressed. No loss of accuracy is caused during model compression, so the size of the model can be compressed while accuracy is guaranteed, meeting the dual requirements on accuracy and model size.
Optionally, the device further includes: a second training module 13, configured to, before the training data is input into the neural network model to be compressed and the target neural network model, input the training data into the neural network model to be compressed and train the neural network model to be compressed, obtaining the trained neural network model to be compressed.
Optionally, the neural network model to be compressed includes: a neural network to be compressed and a classifier to be compressed; the target neural network model includes: a target neural network and a target classifier;
the first training module 12 is specifically configured to: extract a first feature vector from the input training data using the neural network to be compressed, and extract a second feature vector from the input training data using the target neural network;
perform similarity matching on the first feature vector and the second feature vector, and train the target neural network for the current round according to the result of the similarity matching; and
input the first feature vector into the classifier to be compressed to obtain a first classification result;
input the second feature vector into the target classifier to obtain a second classification result;
train the target neural network and the target classifier for the current round according to the comparison result of the first classification result and the second classification result;
obtain the compressed neural network model by training the target neural network and the target classifier for multiple rounds.
Optionally, the first training module 12 is specifically configured to execute the following first comparison operation until the classification loss of the target neural network model falls within a preset loss range, completing the current round of training of the target neural network and the target classifier;
the first comparison operation includes:
comparing the first classification result with the second classification result;
when the comparison result is inconsistent, generating first feedback information, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
determining a new second classification result for the training data using the target neural network and the target classifier with the adjusted parameters, and executing the first comparison operation again.
Optionally, the first training module 12 is further configured to: before the similarity matching is performed on the first feature vector and the second feature vector, perform a noise addition operation on the first feature vector; and perform similarity matching on the noise-added first feature vector and the second feature vector.
Optionally, the first training module 12 is specifically configured to perform similarity matching on the first feature vector and the second feature vector and train the target neural network for the current round according to the result of the similarity matching through the following steps: clustering the first feature vectors and the second feature vectors separately;
generating a first adjacency matrix according to the clustering result of the first feature vectors;
generating a second adjacency matrix according to the clustering result of the second feature vectors;
training the parameters of the target network for the current round according to the similarity between the first adjacency matrix and the second adjacency matrix.
Optionally, the first training module 12 is specifically configured to execute the following similarity determination operation until the similarity measure between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, completing the current round of training of the target neural network;
the similarity determination operation includes:
calculating the similarity measure between the currently available first adjacency matrix and second adjacency matrix;
when the measure is not less than the preset first similarity threshold, generating first feedback information, and adjusting the parameters of the target neural network based on the first feedback information;
extracting a new second feature vector for the training data using the target neural network with the adjusted parameters;
clustering the new second feature vectors, generating a new second adjacency matrix, and executing the similarity calculation operation again.
Optionally, the first training module 12 is specifically configured to perform similarity matching on the first feature vector and the second feature vector and train the target neural network for the current round according to the result of the similarity matching through the following steps:
performing a dimension reduction operation on the first feature vector and the second feature vector separately, obtaining a first dimension-reduced feature vector from the first feature vector and a second dimension-reduced feature vector from the second feature vector;
calculating the similarity of the first dimension-reduced feature vector and the second dimension-reduced feature vector;
training the parameters of the target network for the current round according to the similarity of the first dimension-reduced feature vector and the second dimension-reduced feature vector.
Optionally, the first training module 12 is specifically configured to execute the following similarity determination operation until the similarity measure of the first dimension-reduced feature vector and the second dimension-reduced feature vector is less than a preset second similarity threshold, completing the current round of training of the target neural network;
the similarity determination operation includes:
calculating the similarity measure between the currently available first dimension-reduced feature vector and second dimension-reduced feature vector;
when the measure is not less than the preset second similarity threshold, generating second feedback information, and adjusting the parameters of the target neural network based on the second feedback information;
extracting a new second feature vector for the training data using the target neural network with the adjusted parameters;
performing the dimension reduction operation on the new second feature vector, generating a new second dimension-reduced feature vector, and executing the similarity calculation operation again.
Corresponding to the neural network model compression method in Fig. 1, Embodiment 8 of the present invention further provides a computer device. As shown in Fig. 11, the device includes a memory 1000, a processor 2000, and a computer program stored on the memory 1000 and runnable on the processor 2000, wherein the processor 2000, when executing the computer program, implements the steps of the above neural network model compression method.
Specifically, the memory 1000 and the processor 2000 can be a general-purpose memory and processor, which are not specifically limited here. When the processor 2000 runs the computer program stored in the memory 1000, the above neural network model compression method can be executed. This solves the problem that existing model compression methods compress a model at the cost of its accuracy and cannot meet accuracy requirements in use, and thereby achieves the effect of compressing the size of a model while ensuring the accuracy of the neural network model.
Corresponding to the neural network model compression method in Fig. 1, Embodiment 9 of the present invention further provides a computer-readable storage medium on which a computer program is stored; when the computer program is run by a processor, the steps of the above neural network model compression method are executed.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk. When the computer program on the storage medium is run, the above neural network model compression method can be executed. This solves the problem that existing model compression methods compress a model at the cost of its accuracy and cannot meet accuracy requirements in use, and thereby achieves the effect of compressing the size of a model while ensuring the accuracy of the neural network model.
The computer program product of the neural network model compression method and device provided by the embodiments of the present invention includes a computer-readable storage medium storing program code; the instructions included in the program code can be used to execute the methods in the foregoing method embodiments. For the specific implementation, refer to the method embodiments; it is not repeated here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and device described above can refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
If the functions are realized in the form of software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that those familiar with the technical field can easily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A neural network model compression method, characterized in that the method comprises:
inputting training data into a to-be-compressed neural network model and a target neural network model;
training the target neural network model based on feature vectors and classification results extracted from the training data by the to-be-compressed neural network model, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the to-be-compressed neural network model.
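For illustration, a minimal PyTorch sketch of the training scheme of claim 1 follows. The layer sizes, the KL-divergence loss, the optimizer, and the random stand-in batch are all assumptions made for the sketch; the claim itself fixes only that the target model has fewer parameters and is trained from the to-be-compressed model's outputs.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# To-be-compressed model (larger) and target model (fewer parameters).
teacher = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))
student = nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 10))

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
teacher.eval()  # the to-be-compressed model only guides training; it is not updated

x = torch.randn(32, 784)  # stand-in batch of training data
for _ in range(100):
    with torch.no_grad():
        t_logits = teacher(x)  # classification results of the to-be-compressed model
    s_logits = student(x)
    # Train the target model so its classification results match the guide's.
    loss = F.kl_div(F.log_softmax(s_logits, dim=1),
                    F.softmax(t_logits, dim=1), reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Because the smaller model is optimized to reproduce the guiding model's outputs rather than retrained from labels alone, the sketch mirrors the stated goal of compressing model size without a designed-in precision loss.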
2. The method according to claim 1, characterized in that, before the training data is input into the to-be-compressed neural network model and the target neural network model, the method further comprises:
inputting the training data into the to-be-compressed neural network model and training the to-be-compressed neural network model, to obtain a trained to-be-compressed neural network model.
3. The method according to claim 1, characterized in that the to-be-compressed neural network model comprises a to-be-compressed neural network and a to-be-compressed classifier, and the target neural network model comprises a target neural network and a target classifier;
the training of the target neural network model based on the feature vectors and classification results extracted from the training data by the to-be-compressed neural network model, to obtain the compressed neural network model, specifically comprises:
extracting a first feature vector from the input training data using the to-be-compressed neural network, and extracting a second feature vector from the input training data using the target neural network;
performing similarity matching on the first feature vector and the second feature vector, and performing a current round of training on the target neural network according to the result of the similarity matching; and
inputting the first feature vector into the to-be-compressed classifier to obtain a first classification result;
inputting the second feature vector into the target classifier to obtain a second classification result;
performing a current round of training on the target neural network and the target classifier according to a comparison result of the first classification result and the second classification result;
obtaining the compressed neural network model by performing multiple rounds of training on the target neural network and the target classifier.
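As a hedged sketch of claim 3, the two models can be split into feature networks and classifiers, with the target side trained on a feature-similarity term plus agreement of the two classification results. Cosine similarity and KL divergence are assumed choices here; the claim does not name a particular similarity measure or loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# To-be-compressed neural network and classifier.
teacher_net = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 128))
teacher_clf = nn.Linear(128, 10)
# Target neural network and classifier (fewer parameters overall).
student_net = nn.Linear(784, 128)
student_clf = nn.Linear(128, 10)

opt = torch.optim.Adam(list(student_net.parameters()) +
                       list(student_clf.parameters()), lr=1e-3)
x = torch.randn(32, 784)  # stand-in training batch

with torch.no_grad():
    f1 = teacher_net(x)   # first feature vector
    p1 = teacher_clf(f1)  # first classification result
f2 = student_net(x)       # second feature vector
p2 = student_clf(f2)      # second classification result

# Similarity matching of the two feature vectors.
match_loss = 1 - F.cosine_similarity(f1, f2, dim=1).mean()
# Comparison of the two classification results.
cls_loss = F.kl_div(F.log_softmax(p2, dim=1),
                    F.softmax(p1, dim=1), reduction="batchmean")

opt.zero_grad()
(match_loss + cls_loss).backward()
opt.step()
```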
4. The method according to claim 3, characterized in that the following first comparison operation is performed until the classification loss of the target neural network model falls within a preset loss range, thereby completing the current round of training of the target neural network and the target classifier;
the first comparison operation comprises:
comparing the first classification result with the second classification result;
when the comparison result is inconsistent, generating first feedback information, and adjusting the parameters of the target neural network and the target classifier based on the first feedback information;
determining a new second classification result for the training data using the target neural network and the target classifier with the adjusted parameters, and performing the first comparison operation again.
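Read this way, the first comparison operation of claim 4 is a loop that stops once the classification loss enters the preset range. The sketch below deliberately continues the claim-3 code above (reusing student_net, student_clf, p1, x, and opt); the 0.05 bound and the step cap are assumptions.

```python
loss_range = 0.05      # assumed "preset loss range"
for _ in range(1000):  # step cap so the sketch always terminates
    p2 = student_clf(student_net(x))  # new second classification result
    cls_loss = F.kl_div(F.log_softmax(p2, dim=1),
                        F.softmax(p1, dim=1), reduction="batchmean")
    if cls_loss.item() < loss_range:
        break  # results consistent enough; current round complete
    opt.zero_grad()
    cls_loss.backward()  # gradients play the role of the first feedback information
    opt.step()           # parameter adjustment of target network and classifier
```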
5. The method according to claim 3, characterized in that, before the similarity matching is performed on the first feature vector and the second feature vector, the method further comprises:
performing a noise addition operation on the first feature vector;
the performing of the similarity matching on the first feature vector and the second feature vector specifically comprises:
performing similarity matching on the noise-added first feature vector and the second feature vector.
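Continuing the same sketch, claim 5's noise addition could look as follows; Gaussian noise and the 0.1 scale are assumptions, since the claim does not fix the noise type.

```python
# Add noise to the first feature vector before similarity matching.
noisy_f1 = f1 + 0.1 * torch.randn_like(f1)
match_loss = 1 - F.cosine_similarity(noisy_f1, f2, dim=1).mean()
```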
6. The method according to any one of claims 3 to 5, characterized in that the performing of the similarity matching on the first feature vector and the second feature vector, and the performing of the current round of training on the target neural network according to the result of the similarity matching, specifically comprise:
clustering the first feature vector and the second feature vector separately;
generating a first adjacency matrix according to the clustering result of the first feature vector;
generating a second adjacency matrix according to the clustering result of the second feature vector;
performing the current round of training on the parameters of the target neural network according to the similarity between the first adjacency matrix and the second adjacency matrix.
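Claims 6 and 7 route the similarity matching through clustering: each feature set yields a same-cluster adjacency matrix, and the two matrices are compared. A self-contained NumPy/scikit-learn sketch follows; k-means, k=4, the random stand-in features, and the normalized-overlap similarity are all assumptions, as the claims leave the clustering algorithm and matrix-similarity measure open.

```python
import numpy as np
from sklearn.cluster import KMeans

def adjacency_from_clusters(features, k=4):
    """A[i, j] = 1 when samples i and j fall in the same cluster."""
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(features)
    return (labels[:, None] == labels[None, :]).astype(np.float32)

rng = np.random.default_rng(0)
f1 = rng.normal(size=(32, 128))   # stand-in first feature vectors
f2 = rng.normal(size=(32, 128))   # stand-in second feature vectors

A1 = adjacency_from_clusters(f1)  # first adjacency matrix
A2 = adjacency_from_clusters(f2)  # second adjacency matrix
# One possible matrix similarity: normalized overlap of adjacency entries.
similarity = (A1 * A2).sum() / np.sqrt((A1 ** 2).sum() * (A2 ** 2).sum())
```

The loop of claim 7 would then wrap this computation in the same adjust-and-retry pattern shown above for claim 4.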
7. The method according to claim 6, characterized in that the following similarity determination operation is performed until the similarity between the first adjacency matrix and the second adjacency matrix is less than a preset first similarity threshold, thereby completing the current round of training of the target neural network;
the similarity determination operation comprises:
calculating the similarity between the currently obtained first adjacency matrix and the currently obtained second adjacency matrix;
when the similarity is not less than the preset first similarity threshold, generating first feedback information, and adjusting the parameters of the target neural network based on the first feedback information;
extracting a new second feature vector from the training data using the target neural network with the adjusted parameters;
clustering the new second feature vector to generate a new second adjacency matrix, and performing the similarity determination operation again.
8. The method according to any one of claims 3 to 5, characterized in that the performing of the similarity matching on the first feature vector and the second feature vector, and the performing of the current round of training on the target neural network according to the result of the similarity matching, specifically comprise:
performing a dimensionality reduction operation on the first feature vector and the second feature vector separately, to obtain a first reduced-dimension feature vector of the first feature vector and a second reduced-dimension feature vector of the second feature vector;
calculating the similarity between the first reduced-dimension feature vector and the second reduced-dimension feature vector;
performing the current round of training on the parameters of the target neural network according to the similarity between the first reduced-dimension feature vector and the second reduced-dimension feature vector.
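Claims 8 and 9 instead reduce both feature sets in dimension and score similarity on the reduced vectors. In the sketch below, PCA and mean cosine similarity are assumed stand-ins; the claims leave the dimensionality-reduction method and similarity measure open.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
f1 = rng.normal(size=(32, 128))  # stand-in first feature vectors
f2 = rng.normal(size=(32, 128))  # stand-in second feature vectors

# Fit one projection on both sets so the reduced spaces are comparable.
pca = PCA(n_components=16).fit(np.vstack([f1, f2]))
f1_low = pca.transform(f1)  # first reduced-dimension feature vectors
f2_low = pca.transform(f2)  # second reduced-dimension feature vectors

# Mean cosine similarity between paired reduced-dimension vectors.
cos = (f1_low * f2_low).sum(axis=1) / (
    np.linalg.norm(f1_low, axis=1) * np.linalg.norm(f2_low, axis=1))
similarity = cos.mean()
```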
9. The method according to claim 8, characterized in that the following similarity determination operation is performed until the similarity between the first reduced-dimension feature vector and the second reduced-dimension feature vector is less than a preset second similarity threshold, thereby completing the current round of training of the target neural network;
the similarity determination operation comprises:
calculating the similarity between the currently obtained first reduced-dimension feature vector and the currently obtained second reduced-dimension feature vector;
when the similarity is not less than the preset second similarity threshold, generating second feedback information, and adjusting the parameters of the target neural network based on the second feedback information;
extracting a new second feature vector from the training data using the target neural network with the adjusted parameters;
performing the dimensionality reduction operation on the new second feature vector to generate a new second reduced-dimension feature vector, and performing the similarity determination operation again.
10. A neural network model compression device, characterized in that the device comprises:
an input module, configured to input training data into a to-be-compressed neural network model and a target neural network model;
a training module, configured to train the target neural network model based on feature vectors and classification results extracted from the training data by the to-be-compressed neural network model, to obtain a compressed neural network model;
wherein the number of parameters of the target neural network model is smaller than the number of parameters of the to-be-compressed neural network model.
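Claim 10 restates the method as a device built from an input module and a training module; the class layout below is purely illustrative, with the loss function and optimizer left as caller-supplied assumptions.

```python
import torch
import torch.nn as nn

class NeuralNetworkCompressionDevice:
    """Illustrative module layout for the device of claim 10."""

    def __init__(self, to_be_compressed: nn.Module, target: nn.Module):
        self.to_be_compressed = to_be_compressed
        self.target = target

    def input_module(self, training_data):
        # Feed the training data to both models.
        return self.to_be_compressed(training_data), self.target(training_data)

    def training_module(self, training_data, optimizer, loss_fn):
        # Train the target model against the to-be-compressed model's outputs.
        with torch.no_grad():
            guide = self.to_be_compressed(training_data)
        loss = loss_fn(self.target(training_data), guide)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss
```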
CN201810274146.3A 2018-03-29 2018-03-29 Neural network model compression method and device Active CN108510083B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810274146.3A CN108510083B (en) 2018-03-29 2018-03-29 Neural network model compression method and device

Publications (2)

Publication Number Publication Date
CN108510083A true CN108510083A (en) 2018-09-07
CN108510083B CN108510083B (en) 2021-05-14

Family

ID=63379557

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810274146.3A Active CN108510083B (en) 2018-03-29 2018-03-29 Neural network model compression method and device

Country Status (1)

Country Link
CN (1) CN108510083B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170337711A1 (en) * 2011-03-29 2017-11-23 Lyrical Labs Video Compression Technology, LLC Video processing and encoding
US20150178620A1 (en) * 2011-07-07 2015-06-25 Toyota Motor Europe Nv/Sa Artificial memory system and method for use with a computational machine for interacting with dynamic behaviours
CN104661037A (en) * 2013-11-19 2015-05-27 中国科学院深圳先进技术研究院 Tampering detection method and system for compressed image quantization table
WO2015089148A2 (en) * 2013-12-13 2015-06-18 Amazon Technologies, Inc. Reducing dynamic range of low-rank decomposition matrices
CN104331738A (en) * 2014-10-21 2015-02-04 西安电子科技大学 Network reconfiguration algorithm based on game theory and genetic algorithm
EP3168781A1 (en) * 2015-11-16 2017-05-17 Samsung Electronics Co., Ltd. Method and apparatus for recognizing object, and method and apparatus for training recognition model
US20170249536A1 (en) * 2016-02-29 2017-08-31 Christopher J. Hillar Self-Organizing Discrete Recurrent Network Digital Image Codec
CN106096670A * 2016-06-17 2016-11-09 北京市商汤科技开发有限公司 Cascaded convolutional neural network training and image detection method, apparatus and system
CN106251347A * 2016-07-27 2016-12-21 广东工业大学 Subway foreign object detection method, apparatus, device and subway shield door system
CN106503799A * 2016-10-11 2017-03-15 天津大学 Multi-scale-network-based deep learning model and its application in brain state monitoring
CN106778684A * 2017-01-12 2017-05-31 易视腾科技股份有限公司 Deep neural network training method and face recognition method
CN106845381A * 2017-01-16 2017-06-13 西北工业大学 Spatial-spectral joint hyperspectral image classification method based on a dual-channel convolutional neural network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JIAN-HAO LUO et al.: "ThiNet: A Filter Level Pruning Method for Deep Neural Network Compression", Computer Vision Foundation *
WANG Zhengtao: "Research on Compression and Optimization of Deep Neural Networks", China Master's Theses Full-text Database (Information Science and Technology) *

Cited By (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200019840A1 (en) * 2018-07-13 2020-01-16 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for sequential event prediction with noise-contrastive estimation for marked temporal point process
US12014267B2 (en) * 2018-07-13 2024-06-18 Arizona Board Of Regents On Behalf Of Arizona State University Systems and methods for sequential event prediction with noise-contrastive estimation for marked temporal point process
CN110929839B (en) * 2018-09-20 2024-04-16 深圳市商汤科技有限公司 Method and device for training neural network, electronic equipment and computer storage medium
CN110929839A (en) * 2018-09-20 2020-03-27 深圳市商汤科技有限公司 Method and apparatus for training neural network, electronic device, and computer storage medium
CN110163236B (en) * 2018-10-15 2023-08-29 腾讯科技(深圳)有限公司 Model training method and device, storage medium and electronic device
CN110163236A (en) * 2018-10-15 2019-08-23 腾讯科技(深圳)有限公司 The training method and device of model, storage medium, electronic device
CN111242273B (en) * 2018-11-29 2024-04-12 华为终端有限公司 Neural network model training method and electronic equipment
WO2020108368A1 (en) * 2018-11-29 2020-06-04 华为技术有限公司 Neural network model training method and electronic device
CN111242273A (en) * 2018-11-29 2020-06-05 华为终端有限公司 Neural network model training method and electronic equipment
CN110008880A (en) * 2019-03-27 2019-07-12 深圳前海微众银行股份有限公司 A kind of model compression method and device
CN110008880B (en) * 2019-03-27 2023-09-29 深圳前海微众银行股份有限公司 Model compression method and device
CN112020724A (en) * 2019-04-01 2020-12-01 谷歌有限责任公司 Learning compressible features
US12033077B2 (en) 2019-04-01 2024-07-09 Google Llc Learning compressible features
EP3935578A4 (en) * 2019-05-16 2022-06-01 Samsung Electronics Co., Ltd. Neural network model apparatus and compressing method of neural network model
US11657284B2 (en) 2019-05-16 2023-05-23 Samsung Electronics Co., Ltd. Neural network model apparatus and compressing method of neural network model
WO2020231049A1 (en) 2019-05-16 2020-11-19 Samsung Electronics Co., Ltd. Neural network model apparatus and compressing method of neural network model
CN110211121B (en) * 2019-06-10 2021-07-16 北京百度网讯科技有限公司 Method and device for pushing model
CN110211121A (en) * 2019-06-10 2019-09-06 北京百度网讯科技有限公司 Method and apparatus for pushing model
WO2022027937A1 (en) * 2020-08-06 2022-02-10 苏州浪潮智能科技有限公司 Neural network compression method, apparatus and device, and storage medium
US12045729B2 (en) 2020-08-06 2024-07-23 Inspur Suzhou Intelligent Technology Co., Ltd. Neural network compression method, apparatus and device, and storage medium
WO2022062828A1 (en) * 2020-09-23 2022-03-31 深圳云天励飞技术股份有限公司 Image model training method, image processing method, chip, device and medium
CN112288032B (en) * 2020-11-18 2022-01-14 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN112288032A (en) * 2020-11-18 2021-01-29 上海依图网络科技有限公司 Method and device for quantitative model training based on generation of confrontation network
CN113505774B (en) * 2021-07-14 2023-11-10 众淼创新科技(青岛)股份有限公司 Policy identification model size compression method
CN113505774A (en) * 2021-07-14 2021-10-15 青岛全掌柜科技有限公司 Novel policy identification model size compression method
CN115526266B (en) * 2022-10-18 2023-08-29 支付宝(杭州)信息技术有限公司 Model Training Method and Device, Service Prediction Method and Device
CN115526266A (en) * 2022-10-18 2022-12-27 支付宝(杭州)信息技术有限公司 Model training method and device, and business prediction method and device
CN118015343A (en) * 2024-01-18 2024-05-10 中移信息系统集成有限公司 Image filtering method and device and electronic equipment

Also Published As

Publication number Publication date
CN108510083B (en) 2021-05-14

Similar Documents

Publication Publication Date Title
CN108510083A (en) A kind of neural network model compression method and device
Qin et al. Forward and backward information retention for accurate binary neural networks
WO2021164625A1 (en) Method of training an image classification model
Feng et al. Evolutionary fuzzy particle swarm optimization vector quantization learning scheme in image compression
Lazebnik et al. Supervised learning of quantizer codebooks by information loss minimization
CN108108751B (en) Scene recognition method based on convolution multi-feature and deep random forest
CN113707235A (en) Method, device and equipment for predicting properties of small drug molecules based on self-supervision learning
US20200257970A1 (en) Data processing apparatus by learning of neural network, data processing method by learning of neural network, and recording medium recording the data processing method
WO2022179533A1 (en) Quantum convolution operator
Carreira-Perpinán et al. Model compression as constrained optimization, with application to neural nets. Part II: Quantization
CN107251059A (en) Sparse reasoning module for deep learning
CN110728295B (en) Semi-supervised landform classification model training and landform graph construction method
CN102331992A (en) Distributed decision tree training
CN110969086A (en) Handwritten image recognition method based on multi-scale CNN (CNN) features and quantum flora optimization KELM
CN112115967A (en) Image increment learning method based on data protection
CN108647571A Video action classification model training method and device, and video action classification method
CN108334910A Event detection model training method and event detection method
CN113764034A (en) Method, device, equipment and medium for predicting potential BGC in genome sequence
Sepahvand et al. An adaptive teacher–student learning algorithm with decomposed knowledge distillation for on-edge intelligence
CN116629123A (en) Pairing-based single-cell multi-group data integration method and system
JP7427011B2 (en) Responding to cognitive queries from sensor input signals
KR102240882B1 (en) Apparatus, method for generating classifier and classifying apparatus generated thereby
US20230020112A1 (en) Relating complex data
Tang et al. Bringing giant neural networks down to earth with unlabeled data
CN104933438A (en) Image clustering method based on self-coding neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100070, No. 101-8, building 1, 31, zone 188, South Fourth Ring Road, Beijing, Fengtai District
Applicant after: Guoxin Youyi Data Co., Ltd
Address before: 100070, No. 188, building 31, headquarters square, South Fourth Ring Road West, Fengtai District, Beijing
Applicant before: SIC YOUE DATA Co.,Ltd.

GR01 Patent grant