
CN115438784A - Sufficient training method for hybrid bit width hyper-network - Google Patents

Sufficient training method for hybrid bit width hyper-network

Info

Publication number
CN115438784A
Authority
CN
China
Prior art keywords
network
bit width
training
layer
hyper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210965207.7A
Other languages
Chinese (zh)
Inventor
王玉峰
张泽豪
方双康
丁文锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202210965207.7A
Publication of CN115438784A
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses a sufficient training method for a hybrid bit width hyper-network, and belongs to the field of machine learning. The method comprises the following specific steps. First, for a search space containing specific bit widths, a single-precision network is trained under each bit width until convergence, and the quantization error of each layer of the network under the different bit widths is calculated and recorded to form the quantization error of the single-precision networks. Then, a mixed bit width hyper-network containing all the bit widths is constructed for the search space, and in each training round the quantization error of each layer of the hyper-network under each bit width is calculated; the sampling probability of each bit width in the hyper-network for the next round is further adjusted according to the comparison between the quantization error of each bit width of each layer and that of the single-precision network. Finally, the hyper-network is searched with a reinforcement learning algorithm to obtain the optimal bit width configuration. The invention enables each sub-network to be evaluated accurately during the search, and effectively improves the accuracy of the sub-networks and the performance of the optimal solution obtained by the search.

Description

Sufficient training method for hybrid bit width hyper-network
Technical Field
The invention belongs to the field of machine learning, and particularly relates to a sufficient training method for a mixed bit width hyper-network.
Background
Over the past decade, deep learning has attracted increasing attention from researchers because of its great advantages over shallow models in feature extraction and model construction, and it has developed rapidly in fields such as computer vision and character recognition.
Deep learning mainly takes the form of deep neural networks, among which the Convolutional Neural Network (CNN), inspired by biological neuroscience, is one of the pioneering lines of research. Compared with traditional methods, the convolutional neural network has characteristics such as weight sharing, local connectivity and pooling operations, which effectively reduce the number of globally optimized training parameters and the complexity of the model, and give the network model a degree of invariance to scaling, translation and distortion of the input. Thanks to these characteristics, convolutional neural networks perform excellently in many computer vision tasks, including image classification, target detection and recognition, and semantic segmentation.
Although convolutional neural networks deliver reliable results in many vision tasks, their huge storage and computation overhead limits their application on the now widely popular portable devices. To broaden the application of convolutional neural networks, model compression and acceleration have therefore become hot topics in the field of computer vision.
The compression methods for the convolutional neural network at present are mainly divided into three categories:
The first is network pruning. The basic idea is that a convolutional neural network with better performance usually has a more complex structure, but some of its parameters contribute little to the final output and are redundant. An effective means of judging the importance of convolution kernel channels can therefore be found for an existing convolutional neural network, the corresponding redundant convolution kernel parameters can be pruned away, and the efficiency of the neural network improved. In this kind of method, the evaluation criterion has a very important influence on model performance.
The second is neural network architecture search. With a good search algorithm, a machine can automatically find a fast and accurate network within a search space of a certain scope, achieving network compression. The key is to establish a large space of network architectures, explore it with an effective search algorithm, and find the optimal convolutional neural network architecture under a specific combination of training data and computational constraints (such as network size and latency).
The third is network quantization. This method quantizes the weight parameters of a 32-bit full-precision network into lower-bit parameters (such as 8-bit, 4-bit or 1-bit) to obtain a low-bit network. It can effectively reduce parameter redundancy, thereby lowering storage occupation, communication bandwidth and computational complexity, which facilitates deploying deep networks in lightweight application scenarios such as artificial intelligence chips.
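For illustration only, the following minimal Python sketch shows symmetric uniform quantization of a full-precision weight tensor to a given bit width. The function name, the per-tensor scaling scheme and the special handling of the 1-bit case are assumptions made for illustration, not details specified by the method.

```python
import numpy as np

def uniform_quantize(x: np.ndarray, bit_width: int) -> np.ndarray:
    """Fake-quantize a full-precision tensor to `bit_width` bits (illustrative scheme)."""
    if bit_width == 1:
        # binary case: sign of the data with a mean-absolute scaling factor (an assumption)
        return np.sign(x) * np.mean(np.abs(x))
    q_max = 2 ** (bit_width - 1) - 1              # e.g. 127 for 8-bit
    scale = np.max(np.abs(x)) / q_max + 1e-12     # per-tensor scale
    x_int = np.clip(np.round(x / scale), -q_max - 1, q_max)
    return x_int * scale                          # de-quantized ("fake-quantized") values

# Quantize 32-bit weights to 8-bit, 4-bit and 1-bit and inspect the induced error
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
for b in (8, 4, 1):
    w_q = uniform_quantize(w, b)
    print(f"{b}-bit quantization error: {np.mean((w - w_q) ** 2):.6f}")
```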
Among network quantization methods, mixed bit width quantization studies the quantization sensitivity of each layer of the network and assigns each layer the most appropriate bit width according to that sensitivity. The hyper-network search method is one of the important approaches to mixed bit width quantization: a search space containing several bit widths is first defined, a hyper-network containing all selectable bit widths is constructed, and the mixed bit width hyper-network is then trained by sampling sub-networks with different bit widths during training. The trained hyper-network can be searched with algorithms such as evolutionary learning to obtain the mixed bit width sub-network with the highest test accuracy under limited resource constraints. However, during training of the mixed bit width hyper-network, insufficient training of the sub-networks often degrades their accuracy, which in turn affects how accurately each sub-network is evaluated during the search and ultimately causes the search to fall into a suboptimal solution.
How to solve the accuracy degradation caused by insufficient sub-network training in hyper-network training, and thus improve the accuracy of the sub-networks, has become a problem requiring in-depth research.
Disclosure of Invention
The invention provides a sufficient training method for a hybrid bit width hyper-network, addressing the accuracy degradation caused by insufficient sub-network training in current hyper-network search methods. The method first calculates and records the quantization error information of single-precision networks, and then uses this information to adjust the bit width sampling probabilities during mixed bit width hyper-network training so as to guide the training process, so that the sub-networks under each bit width are trained more sufficiently, thereby improving the training efficiency and the final performance of the network.
The sufficient training method for the hybrid bit width hyper-network specifically comprises the following steps:
Step one, for a search space containing n bit widths, train a single-precision network under each bit width to form the corresponding quantization networks.
For a given network structure arch, the search space containing n bit widths is B = {b_1, b_2, ..., b_n}. The input and weight of each convolution layer in arch are quantized with bit widths b_1 through b_n respectively, forming the quantized networks arch_1, arch_2, ..., arch_n.
Step two, train each quantization network until it converges, and calculate the quantization error of each layer of the quantization networks under the different bit widths to form the quantization error Q of the single-precision networks.
For the quantized network arch_n, the quantization error value q_ln of the l-th layer is calculated as:

q_{ln} = \frac{1}{M} \sum_{i=1}^{M} \left( x_{ln}^{(i)} - x_{lnq}^{(i)} \right)^2

where x_ln denotes the full-precision data in the l-th layer of arch_n, x_lnq denotes the data obtained by quantizing x_ln with bit width b_n, and M denotes the number of data elements in the current layer. With L denoting the total number of layers of the quantization network, Q = {q_l}_{l=1:L} represents the quantization error of the single-precision networks, where q_l = [q_l1, q_l2, ..., q_ln].
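The following Python sketch shows how the quantization error matrix Q of step two might be assembled. The mean-squared form of q_ln, the uniform_quantize helper and the reuse of the same stand-in tensors for every arch_n are simplifying assumptions; in the method itself each arch_n is a separately trained, converged network.

```python
import numpy as np

def uniform_quantize(x: np.ndarray, bit_width: int) -> np.ndarray:
    """Illustrative symmetric uniform quantizer (an assumption, not fixed by the method)."""
    q_max = 2 ** (bit_width - 1) - 1
    scale = np.max(np.abs(x)) / q_max + 1e-12
    return np.clip(np.round(x / scale), -q_max - 1, q_max) * scale

def layer_quant_error(x_full: np.ndarray, bit_width: int) -> float:
    """q_ln: mean squared error between full-precision data and its bit_width-bit quantization."""
    x_q = uniform_quantize(x_full, bit_width)
    return float(np.mean((x_full - x_q) ** 2))

# Search space B and stand-ins for the per-layer full-precision data x_ln of each arch_n
bit_widths = [2, 4, 8]                                   # example candidate bit widths
layers = [np.random.randn(128, 64) for _ in range(4)]    # L = 4 layers of synthetic data

# Q[l, n]: quantization error of layer l under bit width b_n
Q = np.array([[layer_quant_error(x_l, b) for b in bit_widths] for x_l in layers])
print(Q.shape)   # (L, n)
print(Q)
```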
Step three, insert quantizers containing the candidate bit widths into each convolution layer of the network structure to construct the mixed bit width hyper-network.
The specific insertion process is as follows:
an input quantizer is inserted into the input end of each convolution layer, a weight quantizer is inserted into each convolution core part, and each quantizer has n candidate bit widths corresponding to the bit width in the search space B. Sampling each bit width in the training process, and after sampling, converting the bit width of each quantizer to the sampled bit width to complete the forward propagation process of the input data.
Step four, before each training round of the mixed bit width hyper-network begins, calculate the quantization error of the current hyper-network under each bit width to form the hyper-network quantization error Q̂.
The specific operation is as follows:
First, all quantizer bit widths of the hyper-network are switched to b_1, and the quantization errors q̂_l1 of all layers are calculated as:

\hat{q}_{l1} = \frac{1}{M} \sum_{i=1}^{M} \left( \hat{x}_{l}^{(i)} - \hat{x}_{l1q}^{(i)} \right)^2

where x̂_l denotes the full-precision data in the l-th layer of the hyper-network and x̂_l1q denotes the data obtained by quantizing x̂_l with bit width b_1.
Then, the quantizer bit widths of the hyper-network are switched in turn to b_2, b_3, ..., b_n, and the quantization errors q̂_l2, q̂_l3, ..., q̂_ln of the hyper-network under each configuration are calculated in the same way. This yields the quantization error matrix of the hyper-network Q̂ = {q̂_l}_{l=1:L}, where the quantization error of each layer,

\hat{q}_{l} = [\hat{q}_{l1}, \hat{q}_{l2}, \ldots, \hat{q}_{ln}],

contains the quantization error values of that layer under the n candidate bit widths.
Step five, according to the difference between the hyper-network quantization error Q̂ and the single-precision network quantization error Q, dynamically adjust the sampling probability of each bit width in each layer of the hyper-network for the current training round.
The difference value for the n-th candidate bit width of the l-th layer of the hyper-network is calculated as:

\Delta_{ln} = \frac{\hat{q}_{ln} - q_{ln}}{q_{ln}}

If Δ_ln is positive, the quantization error of the l-th layer of the hyper-network under bit width b_n is larger than that of the single-precision network, so the layer is not yet sufficiently trained at that bit width and needs further training; if Δ_ln is negative, the training error of the layer at that bit width is already small enough, and the number of times this bit width is sampled can be reduced in subsequent training.
After the error differences are calculated, the positive part of Δ_ln is retained and negative values are set to 0:

\hat{\Delta}_{ln} = \max(\Delta_{ln}, 0)
Then, the quantization error differences across the bit widths of each convolution layer are normalized to obtain the sampling probability of each bit width for the current round:

p_{ln} = \frac{\hat{\Delta}_{ln}}{\sum_{j=1}^{n} \hat{\Delta}_{lj}}

This gives the sampling probability of each bit width of the hyper-network in the current training round. The larger Δ_ln is, the less sufficiently the current bit width has been trained, and hence the larger its sampling probability p_ln in the current round; the network under that bit width is then more likely to be trained, so its quantization error decreases and the quantization accuracy of the network improves.
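A minimal sketch of the step-five update is given below, assuming the normalized-difference form of Δ_ln shown above. The uniform fallback for layers whose bit widths are all sufficiently trained is an extra assumption, since the method does not state what happens when every difference is non-positive.

```python
import numpy as np

def bit_sampling_probs(Q_hat: np.ndarray, Q: np.ndarray) -> np.ndarray:
    """Per-layer bit width sampling probabilities for the current round (step five).

    Q_hat and Q have shape (L, n): hyper-network and single-precision quantization
    errors for L layers and n candidate bit widths.
    """
    delta = (Q_hat - Q) / Q                       # element-wise difference, Δ_ln
    delta = np.maximum(delta, 0.0)                # keep the positive part, zero out negatives
    row_sum = delta.sum(axis=1, keepdims=True)
    normalized = delta / np.where(row_sum > 0, row_sum, 1.0)
    uniform = np.full_like(delta, 1.0 / delta.shape[1])
    # assumed fallback: if a layer is sufficiently trained at every bit width, sample uniformly
    return np.where(row_sum > 0, normalized, uniform)

# Example with L = 3 layers and candidate bit widths (2, 4, 8)
Q = np.array([[0.10, 0.04, 0.010],
              [0.20, 0.08, 0.020],
              [0.15, 0.05, 0.010]])
Q_hat = np.array([[0.18, 0.05, 0.009],
                  [0.22, 0.06, 0.030],
                  [0.12, 0.07, 0.010]])
p = bit_sampling_probs(Q_hat, Q)
print(p.round(3))   # each row sums to 1; larger Δ_ln gives a larger sampling probability
```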
Step six, after the hyper-network training is finished, search over the hyper-network to find, among all candidate networks, the optimal candidate network that balances accuracy and computation cost.
The invention has the advantages that:
(1) The sufficient training method for the hybrid bit width hyper-network takes the problem of insufficient training into account and provides a direction for improving hyper-network search algorithms.
(2) The method uses quantization errors to adjust the bit width sampling probabilities during hyper-network training in a targeted manner, effectively improving the accuracy of the sub-networks.
(3) The method improves the performance of every sub-network in the hyper-network, so that each sub-network can be evaluated accurately during the search, which effectively improves the performance of the optimal solution obtained by the search.
Drawings
FIG. 1 is a flow chart of the sufficient training method for a hybrid bit width hyper-network of the present invention;
FIG. 2 is a diagram illustrating the quantizer insertion method of the present invention.
Detailed Description
The following describes in detail a specific embodiment of the present invention with reference to the drawings.
The invention discloses a sufficient training method for a hybrid bit width hyper-network. For a search space containing specific bit widths, a single-precision network is first trained under each bit width until convergence, and the quantization error of each layer of these networks under the different bit widths is calculated and recorded. A mixed bit width hyper-network containing all the bit widths is then constructed for the search space. In each round of hyper-network training, the quantization error of each layer of the hyper-network under each bit width is calculated, and the sampling probability of each bit width in the next round is adjusted according to the comparison between these quantization errors and those of the single-precision networks. Finally, after the mixed bit width hyper-network training is finished, the hyper-network is searched with a reinforcement learning or evolutionary learning algorithm to obtain the optimal bit width configuration. The invention uses the quantization error to adjust the bit width sampling probabilities during hyper-network training in a targeted manner, effectively improves the accuracy of the sub-networks, enables the search algorithm to evaluate each sub-network in the hyper-network accurately, and finally brings the search result close to the optimal solution.
As shown in FIG. 1, the sufficient training method for the hybrid bit width hyper-network specifically comprises the following steps:
Step one, for a search space containing n bit widths, train a single-precision network under each bit width to form the corresponding quantization networks.
For a given network structure arch, the search space containing n bit widths is B = {b_1, b_2, ..., b_n}. The input and weight of each convolution layer in arch are quantized with bit widths b_1 through b_n respectively, forming the quantized networks arch_1, arch_2, ..., arch_n.
Step two, train each quantization network until it converges, and after training calculate the quantization error of each layer of each quantization network to form the quantization error of the single-precision networks Q = {q_l}_{l=1:L}, where L denotes the total number of layers of the quantized network and q_l = [q_l1, q_l2, ..., q_ln] represents the quantization errors of the l-th layer under each bit width.
For the quantized network arch_n, the quantization error value q_ln of the l-th layer is calculated as:

q_{ln} = \frac{1}{M} \sum_{i=1}^{M} \left( x_{ln}^{(i)} - x_{lnq}^{(i)} \right)^2

where x_ln denotes the full-precision data, such as the weights and inputs, in the l-th layer of arch_n; x_lnq denotes the data obtained by quantizing x_ln with bit width b_n; and M denotes the number of data elements in the current layer.
Step three, insert quantizers containing the candidate bit widths into each convolution layer of the network structure to construct the mixed bit width hyper-network.
The specific insertion process is as follows:
the quantizer is inserted in the manner shown in fig. 2, where an input quantizer is inserted into an input end of each convolution layer of the network structure arch, and a weight quantizer is inserted into each convolution core, where each quantizer has n candidate bit widths, and corresponds to the bit width in the search space B. And sampling each bit width with a certain probability in the training process, and converting the bit width of each quantizer to the sampled bit width after sampling is finished so as to finish the forward propagation process of the input data.
Step four, before each training round of the mixed bit width hyper-network begins, calculate the quantization error of the current hyper-network under each bit width to form the hyper-network quantization error Q̂.
The specific operation is as follows:
First, all quantizer bit widths of the hyper-network are switched to b_1, and the quantization errors q̂_l1 of all layers are calculated as:

\hat{q}_{l1} = \frac{1}{M} \sum_{i=1}^{M} \left( \hat{x}_{l}^{(i)} - \hat{x}_{l1q}^{(i)} \right)^2

where x̂_l denotes the full-precision data, such as the weights and inputs, in the l-th layer of the hyper-network, and x̂_l1q denotes the data obtained by quantizing x̂_l with bit width b_1.
Then, the quantizer bit widths of the hyper-network are switched in turn to b_2, b_3, ..., b_n, and the quantization errors q̂_l2, q̂_l3, ..., q̂_ln of the hyper-network under each configuration are calculated in the same way. This yields the quantization error matrix of the hyper-network Q̂ = {q̂_l}_{l=1:L}, where the quantization error of each layer,

\hat{q}_{l} = [\hat{q}_{l1}, \hat{q}_{l2}, \ldots, \hat{q}_{ln}],

contains the quantization error values of that layer under the n candidate bit widths.
Step five, according to the difference between the hyper-network quantization error Q̂ and the single-precision network quantization error Q, dynamically adjust the sampling probability of each bit width in each layer of the hyper-network for the current training round.
First, the normalized difference between Q̂ and Q is calculated as:

\Delta = \frac{\hat{Q} - Q}{Q}

Note that this is an element-wise operation; for example, the difference value for the n-th candidate bit width of the l-th layer of the hyper-network is calculated as:

\Delta_{ln} = \frac{\hat{q}_{ln} - q_{ln}}{q_{ln}}
If Δ_ln is positive, the quantization error of the l-th layer of the hyper-network under bit width b_n is larger than that of the single-precision network, so the layer is not yet sufficiently trained at that bit width and needs further training; if Δ_ln is negative, the training error of the layer at that bit width is already small enough, and the number of times this bit width is sampled can be reduced in subsequent training.
After the error differences are calculated, the positive part of Δ_ln is retained and negative values are set to 0:

\hat{\Delta}_{ln} = \max(\Delta_{ln}, 0)
Then, the quantization error differences across the bit widths of each convolution layer are normalized to obtain the sampling probability of each bit width for the current round:

p_{ln} = \frac{\hat{\Delta}_{ln}}{\sum_{j=1}^{n} \hat{\Delta}_{lj}}

This gives the sampling probability of each bit width of the hyper-network in the current training round. The larger Δ_ln is, the less sufficiently the current bit width has been trained, and hence the larger its sampling probability p_ln in the current round; the network under that bit width is then more likely to be trained, so its quantization error decreases and the quantization accuracy of the network improves.
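As a usage illustration, the sketch below samples one bit width per layer from the probabilities p_ln at each training step of the round. Independent per-layer sampling at every step is an assumption about how the probabilities are consumed, and the numeric values are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
bit_widths = np.array([2, 4, 8])          # candidate bit widths (example values)

# p[l, n]: sampling probability of the n-th bit width for layer l in the current round
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6],
              [0.0, 0.5, 0.5]])

for step in range(3):
    # sample a bit width for every layer and switch that layer's quantizers to it
    config = [int(rng.choice(bit_widths, p=p[l])) for l in range(p.shape[0])]
    print(f"step {step}: sampled per-layer bit widths {config}")
```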
Step six, after the hyper-network training is finished, search over the hyper-network to find, among all candidate networks, the optimal candidate network that balances accuracy and computation cost.
This step may be implemented with an evolutionary search algorithm. First, P candidate networks with distinct bit width codes are randomly sampled from the hyper-network as the initial population; after each sampling, the performance of the candidate networks is evaluated, recorded and ranked. Then, the better-performing networks are used as parents, and crossover and mutation operations are applied to them to generate new children. After the evolutionary search has run for a certain number of rounds, the candidate network with the best performance is selected as the search result; this is the optimal mixed bit width network obtained by the final search.
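Below is a self-contained sketch of such an evolutionary search over bit width codes. The fitness function is only a stand-in for evaluating a sub-network of the trained hyper-network under a resource constraint, and the population size, number of rounds and mutation rate are illustrative values.

```python
import random

BIT_WIDTHS = (2, 4, 8)    # candidate bit widths (example values)
NUM_LAYERS = 8            # number of convolution layers in the network structure
POP_SIZE = 20             # P candidate networks in the population
ROUNDS = 10               # number of evolution rounds

def evaluate(code):
    """Stand-in fitness: in practice this would test the sub-network sampled from the
    trained hyper-network and balance its accuracy against its computation cost."""
    accuracy_proxy = sum(b * (NUM_LAYERS - i) for i, b in enumerate(code))
    cost_penalty = 0.5 * sum(code)
    return accuracy_proxy - cost_penalty

def crossover(a, b):
    cut = random.randrange(1, NUM_LAYERS)
    return a[:cut] + b[cut:]

def mutate(code, rate=0.1):
    return tuple(random.choice(BIT_WIDTHS) if random.random() < rate else b for b in code)

# Initial population: P candidate networks with random bit width codes
population = [tuple(random.choice(BIT_WIDTHS) for _ in range(NUM_LAYERS)) for _ in range(POP_SIZE)]

for _ in range(ROUNDS):
    ranked = sorted(population, key=evaluate, reverse=True)   # evaluate, record and rank
    parents = ranked[: POP_SIZE // 2]                         # better-performing networks become parents
    children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                for _ in range(POP_SIZE - len(parents))]
    population = parents + children

best = max(population, key=evaluate)
print("best mixed bit width configuration found:", best)
```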

Claims (4)

1. A sufficient training method for a hybrid bit width hyper-network, characterized by comprising the following specific steps:
firstly, respectively training single-precision networks with each bit width for search spaces with n bit widths to form respective quantization networks; training each quantization network until the network converges, and calculating the quantization errors of each layer in the quantization networks corresponding to different bit widths to form the quantization error Q of the single-precision network;
for the quantized network arch_n, the quantization error value q_ln of the l-th layer is calculated as:

q_{ln} = \frac{1}{M} \sum_{i=1}^{M} \left( x_{ln}^{(i)} - x_{lnq}^{(i)} \right)^2

where x_ln denotes the full-precision data in the l-th layer of arch_n, x_lnq denotes the data obtained by quantizing x_ln with bit width b_n, and M denotes the number of data elements in the current layer; with L denoting the total number of layers of the quantization network, Q = {q_l}_{l=1:L} represents the quantization error of the single-precision networks, where q_l = [q_l1, q_l2, ..., q_ln];
then, inserting quantizers with the candidate bit widths into each convolution layer of the network structure to construct a mixed bit width hyper-network; before each training round of the mixed bit width hyper-network begins, calculating the quantization error of the current hyper-network under each bit width to form the hyper-network quantization error Q̂;
the specific operation is as follows:
first, all quantizer bit widths of the hyper-network are switched to b_1, and the quantization errors q̂_l1 of all layers are calculated as:

\hat{q}_{l1} = \frac{1}{M} \sum_{i=1}^{M} \left( \hat{x}_{l}^{(i)} - \hat{x}_{l1q}^{(i)} \right)^2

where x̂_l denotes the full-precision data in the l-th layer of the hyper-network and x̂_l1q denotes the data obtained by quantizing x̂_l with bit width b_1;
secondly, the quantizer bit widths of the hyper-network are switched in turn to b_2, b_3, ..., b_n, and the quantization errors q̂_l2, q̂_l3, ..., q̂_ln of the hyper-network under each configuration are calculated in the same way, yielding the quantization error matrix of the hyper-network Q̂ = {q̂_l}_{l=1:L}, where the quantization error of each layer,

\hat{q}_{l} = [\hat{q}_{l1}, \hat{q}_{l2}, \ldots, \hat{q}_{ln}],

contains the quantization error values of that layer under the n candidate bit widths;
then, the quantization error according to the hyper network
Figure FDA00037943885800000111
Dynamically adjusting the sampling probability of the bit width of each layer of the super network in the training round according to the difference between the quantization error Q of the single-precision network and the quantization error Q of the single-precision network;
and finally, after the hyper-network training is finished, searching over the hyper-network to find, among all candidate networks, the optimal candidate network that balances accuracy and computation cost.
2. The method of claim 1, wherein for a given network structure arch the search space containing n bit widths is B = {b_1, b_2, ..., b_n}, and the input and weight of each convolution layer in arch are quantized with bit widths b_1 through b_n respectively, forming the quantized networks arch_1, arch_2, ..., arch_n.
3. The sufficient training method for the hybrid bit width hyper-network according to claim 1, wherein the mixed bit width hyper-network is constructed by the following specific insertion process:
an input quantizer is inserted at the input of each convolution layer and a weight quantizer is inserted at each convolution kernel; each quantizer has n candidate bit widths, corresponding to the bit widths in the search space; during training each bit width is sampled, and after sampling each quantizer is switched to the sampled bit width to complete the forward propagation of the input data.
4. The sufficient training method for the hybrid bit width hyper-network according to claim 1, wherein the specific process for dynamically adjusting the sampling probability is as follows:
firstly, the difference value for the n-th candidate bit width of the l-th layer of the hyper-network is calculated as:

\Delta_{ln} = \frac{\hat{q}_{ln} - q_{ln}}{q_{ln}}
if Δ_ln is positive, the quantization error of the l-th layer of the hyper-network under bit width b_n is larger than that of the single-precision network, so the layer is not yet sufficiently trained at that bit width and needs further training; if Δ_ln is negative, the training error of the layer at that bit width is already small enough, and the number of times this bit width is sampled can be reduced in subsequent training;
then, after the error difference is calculated, the delta is calculated ln The positive part is reserved, and the negative value is set to 0, which is expressed as:
Figure FDA0003794388580000022
finally, normalizing the quantization error difference value between the bit widths of each layer of convolution to obtain the sampling probability of each bit width of the current round:
Figure FDA0003794388580000023
thus, the sampling probability, delta, of each bit width of the hyper-network in the current training round is obtained ln The larger the value is, the lower the training fullness of the current bit width is, and thus the sampling probability p of the current round to the training fullness is ln The larger the bit width is, the easier the network under the current bit width is to be trained, so that the quantization error of the network under the bit width is reduced, and the quantization accuracy of the network is improved.
CN202210965207.7A 2022-08-12 2022-08-12 Sufficient training method for hybrid bit width hyper-network Pending CN115438784A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210965207.7A CN115438784A (en) 2022-08-12 2022-08-12 Sufficient training method for hybrid bit width hyper-network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210965207.7A CN115438784A (en) 2022-08-12 2022-08-12 Sufficient training method for hybrid bit width hyper-network

Publications (1)

Publication Number Publication Date
CN115438784A true CN115438784A (en) 2022-12-06

Family

ID=84241788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210965207.7A Pending CN115438784A (en) 2022-08-12 2022-08-12 Sufficient training method for hybrid bit width hyper-network

Country Status (1)

Country Link
CN (1) CN115438784A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117709409A (en) * 2023-05-09 2024-03-15 荣耀终端有限公司 Neural network training method applied to image processing and related equipment

Similar Documents

Publication Publication Date Title
CN114937151B (en) Lightweight target detection method based on multiple receptive fields and attention feature pyramid
WO2022027937A1 (en) Neural network compression method, apparatus and device, and storage medium
CN111275172B (en) Feedforward neural network structure searching method based on search space optimization
CN111242180B (en) Image identification method and system based on lightweight convolutional neural network
CN112465120A (en) Fast attention neural network architecture searching method based on evolution method
CN111898689A (en) Image classification method based on neural network architecture search
WO2022252455A1 (en) Methods and systems for training graph neural network using supervised contrastive learning
CN113269312B (en) Model compression method and system combining quantization and pruning search
CN114118369B (en) Image classification convolutional neural network design method based on group intelligent optimization
CN115659807A (en) Method for predicting talent performance based on Bayesian optimization model fusion algorithm
CN114528987A (en) Neural network edge-cloud collaborative computing segmentation deployment method
CN116362325A (en) Electric power image recognition model lightweight application method based on model compression
CN115438784A (en) Sufficient training method for hybrid bit width hyper-network
CN117253037A (en) Semantic segmentation model structure searching method, automatic semantic segmentation method and system
CN116976428A (en) Model training method, device, equipment and storage medium
CN113935398B (en) Network traffic classification method and system based on small sample learning in Internet of things environment
Chen et al. DNN gradient lossless compression: Can GenNorm be the answer?
Rui et al. Smart network maintenance in an edge cloud computing environment: An adaptive model compression algorithm based on model pruning and model clustering
CN117194742A (en) Industrial software component recommendation method and system
Peter et al. Resource-efficient dnns for keyword spotting using neural architecture search and quantization
Minu et al. An efficient squirrel search algorithm based vector quantization for image compression in unmanned aerial vehicles
CN116227563A (en) Convolutional neural network compression and acceleration method based on data quantization
CN116010832A (en) Federal clustering method, federal clustering device, central server, federal clustering system and electronic equipment
WO2023082045A1 (en) Neural network architecture search method and apparatus
CN113033653B (en) Edge-cloud cooperative deep neural network model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination