CN115906935A - Parallel differentiable neural network architecture searching method - Google Patents
Parallel differentiable neural network architecture searching method
- Publication number: CN115906935A
- Application number: CN202211299553.2A
- Authority
- CN
- China
- Prior art keywords: network, neural network, units, unit, basic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a parallel differentiable neural network architecture searching method. First, a dual-path super network with binary gates is constructed; next, the search space is relaxed into a continuous one using a sigmoid function; the super network is then optimized by gradient descent to obtain the optimal basic units, namely a common unit and a reduction unit; finally, the obtained basic units are stacked to form the required deep neural network, which is retrained until the network converges. This fast, parallel differentiable neural network architecture searching method significantly improves both the speed and the performance of neural network architecture search.
Description
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to a parallel differentiable neural network architecture searching method.
Background
The rapid development of deep learning has established its dominant position in the field of artificial intelligence. Thanks to the diligent efforts of researchers, the performance of deep neural networks keeps improving. However, manually designing a neural network requires a continuous trial-and-error process and relies heavily on expert design experience, so creating a neural network structure by hand is time-consuming and resource-intensive. To reduce the manpower and cost involved, Neural Architecture Search (NAS) techniques have been proposed. NAS automatically searches for a neural network architecture by means of an algorithm to meet the requirements of different tasks, and has become a research hotspot in the field of automated machine learning.
The core of a NAS method is to construct a huge search space, mine that space with an efficient search algorithm, and find the optimal architecture under given training data and constraint conditions. Early work was primarily based on reinforcement learning and evolutionary algorithms, both of which have shown great potential in finding high-performance neural network architectures. However, NAS methods based on reinforcement learning or evolutionary algorithms usually bear a heavy computational burden, which seriously hinders the wide application and study of NAS. To reduce this burden, weight-sharing algorithms have been proposed: they formulate the search space as an over-parameterized super network and evaluate sampled architectures without additional optimization. By sharing weights, NAS is sped up by several orders of magnitude.
One particular type of weight-sharing method is the differentiable neural network architecture search technique proposed in the paper "DARTS: Differentiable Architecture Search". This technique first defines the search space as a super network stacked from basic units (a common unit and a reduction unit), and finds the optimal neural network architecture by searching for these basic units. DARTS then relaxes the discrete choice of operations into a weighted combination over a fixed set of operations, so the super network can be trained with a gradient-based bilevel optimization method. This gives NAS greater potential to explore optimal network architectures in a large architectural search space. Nevertheless, the prior art still has limitations: it must bear the huge computational cost brought by the large network architecture and the redundant search space, which restricts the wide application and study of NAS.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides a parallel differentiable neural network architecture searching method. First, a dual-path super network with binary gates is constructed; next, the search space is relaxed into a continuous one using a sigmoid function; the super network is then optimized by gradient descent to obtain the optimal basic units, namely a common unit and a reduction unit; finally, the obtained basic units are stacked to form the required deep neural network, which is retrained until the network converges. This fast, parallel differentiable neural network architecture searching method significantly improves both the speed and the performance of neural network architecture search.
The technical scheme adopted by the invention for solving the technical problem comprises the following steps:
step 1: constructing a dual-path super network with binary gates;
the super network is formed by stacking L basic units;
the basic units comprise a common unit and a reduction unit; both are directed acyclic graphs of 7 nodes, consisting of 2 input nodes, 4 intermediate nodes and 1 output node, where the connections between nodes represent different operations, and the connection relationships between nodes differ between the common unit and the reduction unit;
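For illustration only, the directed-acyclic-graph structure of such a 7-node unit can be enumerated with a short Python sketch; this sketch is not part of the claimed method, and the function name and the DARTS-style convention of concatenating the intermediate nodes at the output are assumptions:

```python
def cell_edges(num_inputs=2, num_intermediate=4):
    """Enumerate the candidate edges of one basic unit: every earlier node i
    feeds every intermediate node j (nodes 0-1 are inputs, 2-5 intermediate)."""
    edges = []
    for j in range(num_inputs, num_inputs + num_intermediate):
        for i in range(j):
            edges.append((i, j))
    return edges

print(len(cell_edges()))   # 2 + 3 + 4 + 5 = 14 candidate edges per unit
# The output node is assumed to concatenate the 4 intermediate nodes, as in DARTS.
```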
step 1-1: let the operation pool be O; the operation pool O contains 8 basic operation operators, namely sep-conv-3×3, sep-conv-5×5, dil-conv-3×3, dil-conv-5×5, max-pool-3×3, avg-pool-3×3, skip-connection and none;
the operation pool O is decomposed into two operator subsets O by random sampling 1 And O 2 In which O is 1 And O 2 Satisfy | O 1 |=|O 2 |,|O 1 |+|O 2 I = O andO 1 and O 2 Respectively used for constructing two sub-networks;
two groups of channels are sampled from the input channels of the whole network and are adopted by the two sub-networks respectively, and the two sub-networks are finally merged into one sub-network through an addition operation; the information propagation from a node x_i to a node x_j (with 0 ≤ i < j ≤ 5) in a basic unit of the super network is accordingly described as the sum of the two partial paths, where one set of architecture weights corresponds to the different operations in O1 and the other to the different operations in O2; two groups of channel-sampling masks, consisting only of 0s and 1s, mark the selected (1) and unselected (0) channels, and the two groups of selected channels are processed by the two operator subsets simultaneously;
in this way the super network covers all candidate architectures in the form of two parallel paths;
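The following PyTorch-style sketch illustrates one such dual-path edge. It is illustrative only: the class name, the channel sampling ratio and the use of index buffers in place of explicit 0/1 masks are assumptions, the candidate operations are assumed to be channel-preserving nn.Module instances, and the sigmoid relaxation of step 2 below is already applied to the architecture weights:

```python
import torch
import torch.nn as nn

class DualPathEdge(nn.Module):
    """One super-network edge x_i -> x_j: two channel groups are sampled from the
    input, processed by the operator subsets O1 and O2, and merged by addition."""

    def __init__(self, channels, ops_subset1, ops_subset2, sample_ratio=0.25):
        super().__init__()
        self.k = max(1, int(channels * sample_ratio))          # channels per sampled group
        self.ops1 = nn.ModuleList(ops_subset1)                 # candidate operations from O1
        self.ops2 = nn.ModuleList(ops_subset2)                 # candidate operations from O2
        # one architecture weight per candidate operation
        self.alpha1 = nn.Parameter(1e-3 * torch.randn(len(ops_subset1)))
        self.alpha2 = nn.Parameter(1e-3 * torch.randn(len(ops_subset2)))
        perm = torch.randperm(channels)                        # random channel sampling
        self.register_buffer("idx1", perm[:self.k])            # plays the role of mask 1
        self.register_buffer("idx2", perm[self.k:2 * self.k])  # plays the role of mask 2

    def forward(self, x):
        x1, x2 = x[:, self.idx1], x[:, self.idx2]              # two selected channel groups
        w1 = torch.sigmoid(self.alpha1)                        # independent (0, 1) scores
        w2 = torch.sigmoid(self.alpha2)
        y1 = sum(w * op(x1) for w, op in zip(w1, self.ops1))   # sub-network built from O1
        y2 = sum(w * op(x2) for w, op in zip(w2, self.ops2))   # sub-network built from O2
        return y1 + y2                                         # parallel paths merged by addition
```

How the unselected channels are propagated is omitted here for brevity.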
step 1-2: in the process of training the super network, selectively activating each path to participate in training by using binary gating;
for two nodes x_i and x_j in a basic unit, the binary-gated super network multiplies the two parallel paths by gates gate1 and gate2, where gate1 and gate2 each take the value 0 or 1 and the case in which gate1 and gate2 are both 0 is excluded; the binary gates are drawn by random sampling, so that only the activated path(s) participate in training;
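A minimal sketch of the gate sampling follows; drawing the three admissible combinations uniformly is an assumption, since the text only requires random sampling with the case (0, 0) excluded:

```python
import random

def sample_binary_gates():
    """Draw (gate1, gate2) from {0, 1} x {0, 1}, excluding (0, 0), so that at
    least one of the two parallel paths is active in every training step."""
    return random.choice([(0, 1), (1, 0), (1, 1)])

# During super-network training the edge output becomes
#   gate1 * y1 + gate2 * y2,
# and only the activated path(s) need to be computed, which saves memory.
```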
step 2: relax the search space into a continuous one by means of a sigmoid function and redefine the two sub-networks accordingly, weighting each candidate operation by δ(α), where δ(·) denotes the sigmoid function δ(z) = 1/(1 + e^(−z));
step 3: optimize the super network by gradient descent to obtain the optimal basic units, namely a common unit and a reduction unit;
the optimal α is found by jointly optimizing the network parameters w and the architecture parameters α, which determines the optimal basic units:

min_α L_val(w*(α), α)
s.t. w*(α) = arg min_w L_train(w, α)

where L_train is the training loss and L_val is the validation loss; cross-entropy loss is used for both the training loss and the validation loss;
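A first-order sketch of this alternating bilevel optimization is given below. It assumes the architecture parameters are registered with "alpha" in their parameter names (as in the edge sketch above); the optimizers and hyper-parameters are illustrative and not specified by the patent:

```python
import torch
import torch.nn.functional as F

def search(super_net, train_loader, val_loader, epochs=50, device="cuda"):
    """Alternately update network weights w on the training split and
    architecture parameters alpha on the validation split (cross-entropy loss)."""
    w_params = [p for n, p in super_net.named_parameters() if "alpha" not in n]
    a_params = [p for n, p in super_net.named_parameters() if "alpha" in n]
    w_opt = torch.optim.SGD(w_params, lr=0.025, momentum=0.9, weight_decay=3e-4)
    a_opt = torch.optim.Adam(a_params, lr=3e-4, weight_decay=1e-3)

    super_net.to(device)
    for _ in range(epochs):
        for (x_tr, y_tr), (x_val, y_val) in zip(train_loader, val_loader):
            x_tr, y_tr = x_tr.to(device), y_tr.to(device)
            x_val, y_val = x_val.to(device), y_val.to(device)

            a_opt.zero_grad()                                   # upper level: min_alpha L_val
            F.cross_entropy(super_net(x_val), y_val).backward()
            a_opt.step()

            w_opt.zero_grad()                                   # lower level: min_w L_train
            F.cross_entropy(super_net(x_tr), y_tr).backward()
            w_opt.step()
    return super_net
```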
after the architecture parameters α are obtained, the architecture is discretized according to a one-hot encoding: for each intermediate node of the basic unit, the two operations with the largest α values are selected as the inputs of that node;
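The discretization step could be sketched as follows; the data layout (a dictionary mapping each edge to its α vector) and the prior exclusion of the none operation are assumptions, the text above only specifies the top-2 selection rule:

```python
import torch

def derive_cell(alpha, edges, num_inputs=2, num_intermediate=4):
    """For each intermediate node, keep the two incoming (edge, operation) pairs
    with the largest architecture weight. `alpha[(i, j)]` is a 1-D score tensor."""
    genotype = []
    for j in range(num_inputs, num_inputs + num_intermediate):
        candidates = []
        for (i, jj) in edges:
            if jj != j:
                continue
            scores = alpha[(i, jj)]
            candidates.append((scores.max().item(), i, int(scores.argmax())))
        top2 = sorted(candidates, reverse=True)[:2]        # two strongest inputs for node j
        genotype.append([(i, op_idx) for _, i, op_idx in top2])
    return genotype
```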
step 4: stack the basic units obtained in step 3 to obtain the required deep neural network, and retrain the deep neural network until the network converges.
Preferably, the super network optimization process in step 3 adopts a network formed by stacking 6 common units and 2 reduction units, wherein the 2 reduction units are respectively located at 1/3 and 2/3 of the total depth of the network.
Preferably, the deep neural network required in step 4 is a deep neural network for CIFAR-10, formed by stacking 20 basic units, of which 2 are reduction units and 18 are common units.
Preferably, the deep neural network required in step 4 is a deep neural network for ImageNet, formed by stacking 2 reduction units and 12 common units, with the 2 reduction units respectively located at 1/3 and 2/3 of the total depth of the network.
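The stacking rule in the preferred embodiments can be sketched as follows; the factory callables are hypothetical placeholders for the searched common and reduction units:

```python
def reduction_positions(num_cells):
    """Reduction units sit at one third and two thirds of the total depth,
    e.g. {2, 5} for the 8-unit search network and {6, 13} for a 20-unit network."""
    return {num_cells // 3, 2 * num_cells // 3}

def build_network(num_cells, make_common_unit, make_reduction_unit):
    """Stack the searched basic units into the final deep network."""
    positions = reduction_positions(num_cells)
    return [make_reduction_unit() if idx in positions else make_common_unit()
            for idx in range(num_cells)]

# CIFAR-10 evaluation network: 20 stacked units (18 common + 2 reduction).
# ImageNet evaluation network: 12 common units plus 2 reduction units.
```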
The invention has the following beneficial effects:
the invention provides a rapid and parallel differentiable neural network architecture searching technology, which reduces the memory consumption in the training process and improves the neural network architecture searching speed by constructing a dual-path super network with a binary gate. At the same time, considering that softmax is used to select the best input for the intermediate nodes of the two operator subsets, unfair problems may be encountered. In order to solve the problem, a sigmoid function is introduced, and the performance of each operation operator is measured under the condition of no normalization, so that the performance of the neural network architecture search technology is ensured. The invention obviously improves the speed and the performance of the neural network architecture search.
Drawings
Fig. 1 is a diagram of implementation steps of a fast parallel search method for a differentiable neural network architecture according to the present invention.
Fig. 2 illustrates the search method of the present invention, taking one basic unit as an example.
FIG. 3 is a structural diagram of a basic unit searched on CIFAR-10 according to the present invention: wherein (a) is a normal unit and (b) is a reduction unit.
FIG. 4 is a diagram of the structure of the basic unit searched on ImageNet in the present invention: wherein (a) is a normal unit and (b) is a reduction unit.
Detailed Description
The invention is further illustrated with reference to the following figures and examples.
A parallel differentiable neural network architecture searching method comprises the following steps:
step 1: constructing a dual-path super network with binary gates;
the super network is formed by stacking L basic units;
the basic units comprise a common unit and a reduction unit; both are directed acyclic graphs of 7 nodes, consisting of 2 input nodes, 4 intermediate nodes and 1 output node, where the connections between nodes represent different operations, and the connection relationships between nodes differ between the common unit and the reduction unit;
step 1-1: let the operation pool be O; the operation pool O contains 8 basic operation operators, namely sep-conv-3×3, sep-conv-5×5, dil-conv-3×3, dil-conv-5×5, max-pool-3×3, avg-pool-3×3, skip-connection and none;
the operation pool O is decomposed into two operator subsets O by random sampling 1 And O 2 In which O is 1 And O 2 Satisfy | O 1 |=|O 2 |,|O 1 |+|O 2 I = O andO 1 and O 2 Respectively used for constructing two sub-networks;
two groups of channels are sampled from the input channels of the whole network and are adopted by the two sub-networks respectively, and the two sub-networks are finally merged into one sub-network through an addition operation; the information propagation from a node x_i to a node x_j (with 0 ≤ i < j ≤ 5) in a basic unit of the super network is accordingly described as the sum of the two partial paths, where one set of architecture weights corresponds to the different operations in O1 and the other to the different operations in O2; two groups of channel-sampling masks, consisting only of 0s and 1s, mark the selected (1) and unselected (0) channels, and the two groups of selected channels are processed by the two operator subsets simultaneously;
in this way the super network covers all candidate architectures in the form of two parallel paths;
step 1-2: in the process of training the super network, selectively activating each path to participate in training by using binary gating;
for two nodes x_i and x_j in a basic unit, the binary-gated super network multiplies the two parallel paths by gates gate1 and gate2, where gate1 and gate2 each take the value 0 or 1 and the case in which gate1 and gate2 are both 0 is excluded; the binary gates are drawn by random sampling, so that only the activated path(s) participate in training;
step 2: relax the search space into a continuous one by means of a sigmoid function and redefine the two sub-networks accordingly, weighting each candidate operation by δ(α), where δ(·) denotes the sigmoid function δ(z) = 1/(1 + e^(−z));
the super network optimization process adopts a network formed by stacking 6 common units and 2 reduction units, wherein the 2 reduction units are respectively positioned at 1/3 and 2/3 of the total depth of the network;
step 3: optimize the super network by gradient descent to obtain the optimal basic units, namely a common unit and a reduction unit;
the optimal α is found by jointly optimizing the network parameters w and the architecture parameters α, which determines the optimal basic units:

min_α L_val(w*(α), α)
s.t. w*(α) = arg min_w L_train(w, α)

where L_train is the training loss and L_val is the validation loss; cross-entropy loss is used for both the training loss and the validation loss;
after the architecture parameters α are obtained, the architecture is discretized according to a one-hot encoding: for each intermediate node of the basic unit, the two operations with the largest α values are selected as the inputs of that node;
step 4: stack the basic units obtained in step 3 to obtain the required deep neural network, and retrain the deep neural network until the network converges.
The deep neural network for CIFAR-10 is formed by stacking 20 basic units, of which 2 are reduction units and 18 are common units.
The deep neural network for ImageNet is formed by stacking 2 reduction units and 12 common units, with the 2 reduction units respectively located at 1/3 and 2/3 of the total depth of the network.
The specific embodiment is as follows:
the fast parallel differentiable neural network architecture searching method of the embodiment specifically comprises the following steps:
s1: constructing a super network, wherein the super network is a dual-path super network with binary gates;
s2: utilizing a sigmoid function to carry out search space serialization;
s3: optimizing the super network by using a gradient descending mode to obtain an optimal basic unit (a common unit and a reduction unit);
s4: stack the basic units obtained in step S3 to obtain the required deep neural network, and retrain the deep neural network until the network converges.
By adopting this technical scheme, the memory consumption of the neural network architecture search technique is reduced by constructing a super network containing two parallel paths and a gating operation, which increases the speed of the architecture search. When softmax is used to select the best inputs for the intermediate nodes from the two operator subsets, an unfairness problem may arise; to solve this, a sigmoid function is introduced to relax the search space into a continuous one and to measure the quality of each operation without normalization. The invention thereby significantly improves both the speed and the performance of neural network architecture search.

In step S1, the super network is formed by stacking L basic units. The basic units comprise a common unit and a reduction unit. Both are directed acyclic graphs of 7 nodes, consisting of 2 input nodes, 4 intermediate nodes and 1 output node, where a connection between two nodes represents a possible operation (for example a 3×3 convolution). The reduction unit uses convolutions with a stride of 2, so that the spatial resolution of the feature map is reduced to half of the original. In order to increase the search speed, the invention constructs a dual-path super network with binary gates, whose construction process is as follows:
s11: let the whole operation pool be O; the operation pool O contains 8 basic operation operators, namely sep-conv-3×3, sep-conv-5×5, dil-conv-3×3, dil-conv-5×5, max-pool-3×3, avg-pool-3×3, identity (skip-connection) and none. The operation pool O is decomposed into two smaller operator subsets O1 and O2, where O1 and O2 satisfy |O1| = |O2|, |O1| + |O2| = |O| and O1 ∩ O2 = ∅; O1 and O2 are used to construct two smaller sub-networks respectively. In order to reduce the computational burden while letting the search space cover all possible architectures, the invention designs a dual-path super network with binary gates. First, a partial-connection strategy is adopted: two groups of channels are sampled from the whole set of input channels and are adopted by the two sub-networks respectively. The two sub-networks are combined into one by addition, so the super network appears in a parallel fashion; taking the edge from node x_i to node x_j as an example, the super network is described as the sum of the two partial paths,
where x_i and x_j denote different nodes with 0 ≤ i < j ≤ 5; one set of architecture weights corresponds to the different operations in O1 and the other to the different operations in O2; two groups of channel-sampling masks, consisting only of 0s and 1s, mark the selected (1) and unselected (0) channels; and the two groups of selected channels are processed by the two operator subsets simultaneously. This design brings an intuitive advantage: the super network covers all possible architectures in the form of two parallel paths;
s12: during training of the super network, each path is selectively activated to participate in training by binary gating. Taking the edge from node x_i to node x_j as an example, the binary-gated super network multiplies the two parallel paths by gates gate1 and gate2, where gate1 and gate2 take the value 0 or 1 and, in practice, the case gate1 = 0 and gate2 = 0 is excluded. The binary gates are drawn by random sampling to selectively activate the corresponding path(s) for training, which greatly reduces the memory cost.
In step S2, the sigmoid function is used to relax the search space into a continuous one: each candidate operation is weighted by δ(α), where δ(·) denotes the sigmoid function δ(z) = 1/(1 + e^(−z)).
in step 3, the super network is optimized, and an optimal α is found by jointly optimizing a network parameter w and a structural parameter α to determine an optimal basic unit:
s.t.w * (α)=arg min L train (w,α)
wherein L is train To exercise loss, L val To verify the loss. And cross entropy loss is adopted for both training loss and verification loss. After obtaining the architecture parameter α, according to one-hot encoding:
and selecting the two operations with the maximum alpha value as the input of the middle node of the basic unit.
In the super network optimization process in the step S3, a large network formed by stacking 6 common units and 2 reduction units is adopted, wherein the 2 reduction units are respectively located at 1/3 and 2/3 of the total depth of the network.
In step S4, the deep neural network for CIFAR-10 is formed by stacking 20 basic units (2 reduction units and 18 common units), and the deep neural network for ImageNet is a large network constructed by stacking 2 reduction units and 12 common units, with the 2 reduction units respectively located at 1/3 and 2/3 of the total depth of the network.
The deep neural network constructed in step S4 is evaluated on the corresponding data set to test its performance. On CIFAR-10 it achieves a classification accuracy of 97.47% using only 0.08 GPU-days. Compared with the result of the DARTS reference (97.24% accuracy in 1 GPU-day), both the search speed and the network performance are greatly improved, with the search speed increased by a factor of 12.5. Because the search is fast, the method supports searching directly on ImageNet, where it achieves 76.1% top-1 and 92.8% top-5 classification accuracy using only 2.44 GPU-days.
Claims (4)
1. A parallel differentiable neural network architecture searching method is characterized by comprising the following steps:
step 1: constructing a dual-path super network with binary gates;
the super network is formed by stacking L basic units;
the basic units comprise a common unit and a reduction unit; both are directed acyclic graphs of 7 nodes, consisting of 2 input nodes, 4 intermediate nodes and 1 output node, where the connections between nodes represent different operations, and the connection relationships between nodes differ between the common unit and the reduction unit;
step 1-1: let the operation pool be O; the operation pool O contains 8 basic operation operators, namely sep-conv-3×3, sep-conv-5×5, dil-conv-3×3, dil-conv-5×5, max-pool-3×3, avg-pool-3×3, skip-connection and none;
the operation pool O is decomposed into two operator subsets O by random sampling 1 And O 2 In which O is 1 And O 2 Satisfy | O 1 |=|O 2 |,|O 1 |+|O 2 I = O andO 1 and O 2 Respectively used for constructing two sub-networks;
two groups of channels are sampled from the input channels of the whole network and are adopted by the two sub-networks respectively, and the two sub-networks are finally merged into one sub-network through an addition operation; the information propagation from a node x_i to a node x_j (with 0 ≤ i < j ≤ 5) in a basic unit of the super network is accordingly described as the sum of the two partial paths, where one set of architecture weights corresponds to the different operations in O1 and the other to the different operations in O2; two groups of channel-sampling masks, consisting only of 0s and 1s, mark the selected (1) and unselected (0) channels, and the two groups of selected channels are processed by the two operator subsets simultaneously;
in this way the super network covers all candidate architectures in the form of two parallel paths;
step 1-2: in the process of training the super network, selectively activating each path to participate in training by using binary gating;
for two nodes x_i and x_j in a basic unit, the binary-gated super network multiplies the two parallel paths by gates gate1 and gate2, where gate1 and gate2 each take the value 0 or 1 and the case in which gate1 and gate2 are both 0 is excluded; the binary gates are drawn by random sampling, so that only the activated path(s) participate in training;
step 2: relax the search space into a continuous one by means of a sigmoid function and redefine the two sub-networks accordingly, weighting each candidate operation by δ(α), where δ(·) denotes the sigmoid function δ(z) = 1/(1 + e^(−z));
step 3: optimize the super network by gradient descent to obtain the optimal basic units, namely a common unit and a reduction unit;
the optimal α is found by jointly optimizing the network parameters w and the architecture parameters α, which determines the optimal basic units:

min_α L_val(w*(α), α)
s.t. w*(α) = arg min_w L_train(w, α)

where L_train is the training loss and L_val is the validation loss; cross-entropy loss is used for both the training loss and the validation loss;
after the architecture parameters α are obtained, the architecture is discretized according to a one-hot encoding: for each intermediate node of the basic unit, the two operations with the largest α values are selected as the inputs of that node;
step 4: stack the basic units obtained in step 3 to obtain the required deep neural network, and retrain the deep neural network until the network converges.
2. The parallel differentiable neural network architecture searching method according to claim 1, wherein the super network optimized in step 3 is a network formed by stacking 6 normal units and 2 reduction units, the 2 reduction units being respectively located at 1/3 and 2/3 of the total depth of the network.
3. The parallel differentiable neural network architecture searching method according to claim 1, wherein the deep neural network required in step 4 is a deep neural network for CIFAR-10, formed by stacking 20 basic units, of which 2 are reduction units and 18 are normal units.
4. The parallel differentiable neural network architecture searching method according to claim 1, wherein the deep neural network required in step 4 is a deep neural network for ImageNet, formed by stacking 2 reduction units and 12 common units, the 2 reduction units being respectively located at 1/3 and 2/3 of the total depth of the network.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211299553.2A CN115906935B (en) | 2022-10-23 | 2022-10-23 | Parallel differentiable neural network architecture searching method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115906935A (en) | 2023-04-04
CN115906935B CN115906935B (en) | 2024-10-29 |
Family
ID=86490625
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211299553.2A Active CN115906935B (en) | 2022-10-23 | 2022-10-23 | Parallel differentiable neural network architecture searching method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115906935B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117953296A (en) * | 2024-02-01 | 2024-04-30 | 华东交通大学 | Neural network architecture searching method for remote sensing image classification |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113361680A (en) * | 2020-03-05 | 2021-09-07 | 华为技术有限公司 | Neural network architecture searching method, device, equipment and medium |
CN114359109A (en) * | 2022-01-12 | 2022-04-15 | 西北工业大学 | Twin network image denoising method, system, medium and device based on Transformer |
WO2022121100A1 (en) * | 2020-12-11 | 2022-06-16 | 华中科技大学 | Darts network-based multi-modal medical image fusion method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant |