CN111723852B - Robust training method for target detection network - Google Patents
Robust training method for target detection network
- Publication number
- CN111723852B (grant) · CN202010480420.XA (application)
- Authority
- CN
- China
- Prior art keywords
- label
- mining
- network
- training
- target detection
- Prior art date
- 2020-05-30
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
The invention relates to a robust training method for a target detection network, comprising the following steps: acquiring a training sample, wherein some of the detection targets on the training sample carry manual annotation boxes; performing feature extraction on the training sample using the target detection network, and generating proposal boxes on the training sample; marking the proposal boxes with original sampling labels, the original sampling labels comprising positive labels and negative labels; performing a pooling operation on the positively labeled proposal boxes via a pooling branch, and outputting a first region-of-interest feature; inputting the first region-of-interest feature into a mining network, wherein the mining network is a fully-connected neural network that generates a new proposal-box label, namely a mining label; fusing the mining label with the original sampling label to generate a gold label; and using the gold label to train the target detection network.
Description
Technical Field
The invention relates to the technical field of computer vision and target detection, in particular to a robust training method for a target detection network.
Background
In recent years, object detection frameworks based on convolutional neural networks (CNNs) have become a powerful tool for a variety of computer vision tasks and are widely applied to object localization and object counting. These CNN-based detection frameworks have improved continuously, and a number of excellent architectures have been proposed. Among them, region-based detection frameworks (e.g., Faster R-CNN, FPN), which include a region-proposal preprocessing step, are widely used for their more accurate detection performance. Many approaches also keep improving feature extractors by optimizing their network architectures. However, little work has addressed how to enhance training robustness under non-optimal parameters, or the trainability of the network under varying label quality.
Disclosure of Invention
The present application is proposed to solve the above technical problem, and provides a robust training method for a target detection network.
According to an aspect of the present application, there is provided a robust training method for a target detection network, including: acquiring a training sample image, wherein some of the detection targets on the training sample image carry manual annotation boxes; performing feature extraction on the training sample image using the target detection network, and generating proposal boxes on the training sample image; marking the proposal boxes with original sampling labels, the original sampling labels comprising positive labels and negative labels; performing a pooling operation on the positively labeled proposal boxes via a pooling branch, and outputting a first region-of-interest feature; inputting the first region-of-interest feature into a mining network, wherein the mining network is a fully-connected neural network that generates a new proposal-box label, namely a mining label; fusing the mining label with the original sampling label to generate a gold label; and using the gold label to train the target detection network.
Compared with the prior art, the robust training method for a target detection network adds proposal-box mining and label-fusion stages to the network's training process. This effectively overcomes mislabeled proposal boxes and excessive false positives among the samples, caused by missing manual annotation boxes or by thresholds (the first threshold and the second threshold) set too high or too low, and improves the robustness of the network training process against such interference.
Drawings
The above and other objects, features and advantages of the present application will become more apparent by describing in more detail embodiments of the present application with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the principles of the application. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 is a flow chart of the robust training method for a target detection network of the present invention;
FIG. 2 is a breakdown of the processing stages of FIG. 1;
FIG. 3 shows some examples of positive labels generated when training on the sparse VOC2007 training set;
FIG. 4 is a comparison graph (1) of results from a target detection network obtained by the common training method versus the training method proposed in the present application, under sparse COCO;
FIG. 5 is a comparison graph (2) of results from a target detection network obtained by the common training method versus the training method proposed in the present application, under COCO.
Detailed Description
Hereinafter, example embodiments of the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments of the present application, and it should be understood that the present application is not limited to the example embodiments described herein.
Summary of the application
Take Faster R-CNN as an example of a target detection network. During training, Faster R-CNN generates proposal boxes and computes the intersection-over-union (IoU) between each proposal box and the annotation boxes. If the IoU exceeds a manually set threshold, the proposal box is marked with a category label (positive sample); otherwise it is marked with a background label (negative sample), and the network is trained on these positive and negative samples. However, if manual annotation boxes are missing from an image, proposal boxes will be given erroneous labels. In addition, a non-optimal manually set threshold degrades positive/negative sampling: a threshold set too high loses too many positive samples and weakens the network's ability to recognize targets, while a threshold set too low admits too many false positives into the sampled set, interfering with normal network training and hurting final performance. An illustrative sketch of this sampling step is given below.
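By way of illustration only, the following is a minimal sketch of the IoU-based sampling just described; the function names, the (x1, y1, x2, y2) box format, and the threshold values are assumptions for exposition, not details fixed by the patent.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all in (x1, y1, x2, y2) format."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_box + areas - inter)

def sample_labels(proposals, gt_boxes, gt_classes, first_thresh=0.5, second_thresh=0.5):
    """Original sampling: positive label above the first threshold,
    negative (background = 0) below the second, ignored (-1) in between."""
    labels = np.full(len(proposals), -1, dtype=np.int64)
    for i, p in enumerate(proposals):
        overlaps = iou(p, gt_boxes)
        best = overlaps.argmax()
        if overlaps[best] > first_thresh:
            labels[i] = gt_classes[best]   # positive: category of best-matching annotation
        elif overlaps[best] < second_thresh:
            labels[i] = 0                  # negative: background
    return labels
```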
Aiming at these technical problems, the invention seeks to improve the training robustness of a pathological-image detection network on training data of varying annotation quality and under non-optimal parameters. The core component of the invention is a neural network called the "mining network". The mining network learns the characteristics of positive samples and mines potential positive samples in the images. Because the mined positive samples typically include positive samples lost to non-optimal parameters and missing annotations, combining the mined positive samples with the originally sampled positive samples recovers the positive proposals that would otherwise be lost to improper manual parameter settings and missing annotations.
Having described the general principles of the present application, various non-limiting embodiments of the present application will now be described with reference to the accompanying drawings.
Exemplary method
As shown in fig. 1, the robust training method for a target detection network includes:
s10, acquiring a training sample image, wherein a part of detection targets on the training sample image carry a manual labeling frame;
s20, performing feature extraction on the training sample image by using a target detection network, and generating a suggestion box on the training sample image;
the task of the target detection network is to locate and identify an object from an image, and the image space is an Euclidean space which is not an effective feature separable space, so that a feature extractor is needed to be used for feature combination of field pixels of the image, and features of a larger range and even the whole image are mapped to a high-dimensional separable space. Because the performance of the network is closely related to the separability of the feature space, the backbone network of the target detection network often utilizes a mainstream classification network that has been widely verified to extract and combine features. The classification networks are usually pre-trained on a large-scale public data set, so that the search range of a parameter space of network parameters is effectively limited by a transfer learning mode, and the training difficulty of the network on a new target detection task is further reduced. Therefore, in the invention, the classification model ResNet101 after pre-training is used as a backbone network to execute a feature extraction task.
S30, marking original sampling labels on the proposal boxes: determine whether the IoU between a proposal box and a manual annotation box is larger than a set first threshold, and if so, mark the proposal box with a positive label; determine whether the IoU between the proposal box and the manual annotation box is smaller than a set second threshold, and if so, mark the proposal box with a negative label. Both the positive labels and the negative labels are original sampling labels;
Proposal boxes that are neither positive nor negative do not help network training, so the number of positive labels is crucial to training the detector.
S40, adopting two pooling branches to separately pool the proposal boxes marked with positive labels, and outputting a first region-of-interest feature and a second region-of-interest feature;
The feature-map region corresponding to a positive label is called a region of interest (RoI). Two parallel pooling branches pool the feature-map region of each RoI and output, respectively, the RoI feature used for mining (the first RoI feature) and the RoI feature used for detection (the second RoI feature). The two parallel branch structures of RoI pooling ensure that the mining process does not interfere with the training process of the detector, as sketched below. The RoI feature used for detection (the second region-of-interest feature) is input into the target detection network, which outputs the detection result.
S50, inputting the first region-of-interest feature into a mining network, where the mining network is a fully-connected neural network that generates a new proposal-box label, namely the mining label;
the mining network is a fully-connected neural network, the input of which is the RoI characteristic used for mining, and the hidden layer of which can be one or more layers. The mining network outputs a probability distribution (mining score) with the suggested box category activated by softmax, then the mining score is subjected to one-hot coding, and a suggested box mining label represented by m is generated. This process can be expressed as:
wherein, m represents the suggestion box mining label,is to excavate the network(s),is the input RoI feature used for mining.
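A minimal sketch of such a mining head follows; the input dimension, hidden width, and class count are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiningNetwork(nn.Module):
    """Sketch of the fully-connected mining network: RoI feature -> mining label."""
    def __init__(self, in_dim=2048 * 7 * 7, hidden=1024, num_classes=21):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, roi_feat):
        logits = self.layers(roi_feat)
        scores = F.softmax(logits, dim=1)                    # mining scores
        m = F.one_hot(scores.argmax(dim=1), scores.size(1))  # mining label m = onehot(M(f))
        return scores, m
```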
S60, fusing the mining label with the original sampling label to generate a gold label, the gold label serving as the final label of the proposal box;
and (4) the label obtained by fusing the mining label of the suggestion box and the original sampling label is called a gold label, and the gold label is used as a real label for detection training. By generating the gold tags through the merging operation, it can be ensured that the performance of the probe is not affected even under the worst condition (excavation network is invalid).
Specifically, the gold tag (g) is the union of the original sampling tag (a) and the suggestion box mining tag (m). Some false negative tags (which should be positive but sampled negative) in the original sample tags will be corrected by the suggestion box mining tag by the merge operation. Therefore, lost due to improper manual thresholds and missing annotations will be recoveredA number of positive tags, the tag merge process can be expressed as:,
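For illustration, the recovery effect of the union might look like the sketch below, with labels stored as class indices (0 = background) rather than one-hot vectors for brevity; the function name is hypothetical.

```python
import torch

def fuse_labels(a, m):
    """Gold label g = a ∪ m. Where the original sampling label a is background
    but the mining label m is a foreground class (a false negative), the
    mining label corrects it back to positive."""
    recovered = (a == 0) & (m != 0)
    g = a.clone()
    g[recovered] = m[recovered]
    return g

# Example: proposal 1 was sampled as background but mined as class 3 -> recovered.
a = torch.tensor([2, 0, 0])
m = torch.tensor([2, 3, 0])
print(fuse_labels(a, m))  # tensor([2, 3, 0])
```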
S70, using the proposal boxes corresponding to the gold labels to train the target detection network. The total loss for network training is given by:

$$L = L_{cls}(p, g) + L_{loc} + L_{mine}$$

where p is the final output probability distribution after softmax activation, $L_{loc}$ is the localization loss, and $L_{cls}$ is the cross-entropy loss, which can be expressed as:

$$L_{cls}(p, g) = -\frac{1}{N}\sum_{i=1}^{N} g_i \log p_i$$

where N is the number of proposal boxes, $p_i$ is the classification probability distribution of the proposal box indexed by i, output by the Fast R-CNN branch, and $g_i$ is the gold label of the proposal box indexed by i; the original sampling labels are thus refined through the gold labels.

$L_{mine}$ denotes the mining loss, i.e. the cross-entropy loss for training the mining network:

$$L_{mine} = -\frac{1}{N}\sum_{i=1}^{N} a_i \log \hat{m}_i$$

where $a_i$ is the assigned (sampled) label indexed by i and $\hat{m}_i$ is the mining-network output indexed by i. Note that the labels used to train the mining network are the sampling labels. Each training step typically yields hundreds of labeled proposals, so hundreds of sampled labels are available, ensuring that the mining network can adequately learn the characteristics of positive labels; the gold labels can then be used to train the whole target detection network. The overall loss function comprises a classification loss and a localization loss, where in this application the classification loss comprises the cross-entropy loss $L_{cls}$ and the mining loss $L_{mine}$; the localization loss follows the conventional calculation and is not described again here.
As shown in FIG. 2, a generic R-CNN training process is drawn with dotted lines: proposal boxes are obtained by refining the positions of default recommended regions (e.g., the "anchors" in Faster R-CNN); each proposal box is then assigned a category label (or background label) and used as a training sample for the detector. The present application adds proposal-box mining and label fusion to this training process, drawn with chain lines in FIG. 2. This effectively overcomes mislabeled proposal boxes and excessive false positives caused by missing manual annotation boxes or by thresholds (the first threshold and the second threshold) set too high or too low, and improves the robustness of the network training process against such interference.
To verify the validity of this patent, experiments were performed on the PASCAL VOC 2007 and MS COCO 2017 datasets. PASCAL VOC 2007 consists of 5k training images and 5k test images covering about 20 object classes. The COCO dataset contains about 118k training images and 5k validation images, and testing uses the validation set. A sparse dataset is created manually by deleting annotations at random until each training image retains only one annotation per class, as shown in FIG. 3(a) (sparse annotation). Sparsification is applied only to the PASCAL and COCO training sets; the PASCAL test set and the COCO validation set are kept complete.
1. Experimental parameters and details:
In the experiments, the target detection network is Faster R-CNN with a ResNet101 feature extractor pre-trained on ImageNet. Training runs for 150k steps on PASCAL and 1500k steps on COCO with a batch size of 1. The learning rate is initially set to 0.0001 and is divided by 10 at 60k and 80k steps on PASCAL and at 600k and 800k steps on COCO. During training, images are scaled so that the short side is 600 pixels and the long side is at most 1000 pixels. In addition, images are randomly flipped horizontally to augment the training data. A proposal box whose IoU with an annotation is higher than 0.5 is assigned a positive label; otherwise it is negative.
2. Quantitative results:
TABLE 1: Faster R-CNN trained on the PASCAL training set; mean average precision (mAP) and average recall (AR) results evaluated on the PASCAL test set
Table 1 lists the results evaluated on the PASCAL 2007 test set. Under training on sparse PASCAL, the method of this patent improves mAP (mean average precision) by 3.0% and AR by 2.1%. Meanwhile, the method achieves a 0.7% AR (average recall) improvement on the original PASCAL.
TABLE 2: Average precision (AP) results using Faster R-CNN trained on the MS COCO training set and evaluated on the MS COCO validation set
Table 2 shows the results evaluated on the COCO validation set: the method of the invention increases AP by 1.6% and 1.0% when trained on sparse COCO and on complete COCO, respectively. In addition, the method improves AP@0.5 by 3.0% and 2.5% under sparse and complete COCO respectively, where AP@0.5 denotes the result at the single IoU threshold of 0.5. AP-s, AP-m, and AP-l are the AP indices for small, medium, and large targets, respectively.
3. Robustness analysis:
TABLE 3: Average recall (AR) results using Faster R-CNN trained on the MS COCO training set and evaluated on the MS COCO validation set
In Table 3, the AR results of the present invention (19.7 and 25.7) improve over the original Faster R-CNN (17.4 and 23.5). This section explores the training performance of the target detection network and the effectiveness of the invention under different IoU thresholds. The number of positive proposal boxes at different IoU thresholds during the last training period on the PASCAL training set is counted, and the average number of positive proposals per image is reported. The mAP results of networks trained on the PASCAL training set and evaluated on the test set are also given.
TABLE 4: Average number of positive proposal boxes over the last training period (at different IoU thresholds), and mAP results evaluated on the PASCAL test set
As shown in Table 4, the mAP results of the method of the invention outperform Faster R-CNN except when the IoU threshold is 0.3. Moreover, as the IoU threshold increases, the method achieves increasingly significant mAP gains: at IoU thresholds of 0.6, 0.7, and 0.8, the mAP improvements are 1.0%, 2.7%, and 6.8%, respectively.
4. Qualitative results
FIG. 4 and FIG. 5 illustrate some detection results produced by the method of this patent, compared against Faster R-CNN. Faster R-CNN trained on the sparse COCO dataset tends to miss some objects (red dashed boxes), an error the method of this patent largely avoids. Meanwhile, the method of this patent obtains more accurate predictions on the COCO dataset.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present application. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the application. Thus, the present application is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein. The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the application to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.
Claims (2)
1. A robust training method for a target detection network, characterized by comprising the following steps:
acquiring a training sample image, wherein some of the detection targets on the training sample image carry manual annotation boxes;
performing feature extraction on the training sample image using the target detection network, and generating proposal boxes on the training sample image;
marking the proposal boxes with original sampling labels, wherein the original sampling labels comprise positive labels and negative labels;
performing a pooling operation on the positively labeled proposal boxes via a pooling branch, and outputting a first region-of-interest feature;
inputting the first region-of-interest feature into a mining network, wherein the mining network is a fully-connected neural network that generates a new proposal-box label, namely a mining label;
fusing the mining label with the original sampling label to generate a gold label;
using the gold label for training of the target detection network;
the generation process of the mining label comprises the following steps: inputting the first region of interest feature into a mining network, outputting the probability distribution with the suggested box category activated by softmax by the mining network, performing one-hot coding on the probability distribution, and generating the mining tag, which is specifically represented as:
wherein, m represents a digging label,it is shown that the network is mined,representing a first region of interest feature;
the gold label being the union of the original sampling label and the mining label, wherein, through the union operation, false-negative labels in the original sampling labels, i.e. labels that should be positive but were marked negative, are corrected by the mining label and restored to positive labels.
2. The robust training method for a target detection network as claimed in claim 1, wherein the loss function for target detection network training is:

$$L = L_{cls}(p, g) + L_{loc} + L_{mine}$$

where p is the softmax-activated probability distribution over the proposal-box categories and g denotes the gold label;

$$L_{cls}(p, g) = -\frac{1}{N}\sum_{i=1}^{N} g_i \log p_i$$

where N denotes the number of proposal boxes, $p_i$ denotes the softmax-activated probability distribution over the proposal-box categories indexed by i, and $g_i$ denotes the gold label indexed by i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010480420.XA CN111723852B (en) | 2020-05-30 | 2020-05-30 | Robust training method for target detection network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111723852A CN111723852A (en) | 2020-09-29 |
CN111723852B true CN111723852B (en) | 2022-07-22 |
Family
ID=72565402
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010480420.XA Active CN111723852B (en) | 2020-05-30 | 2020-05-30 | Robust training method for target detection network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111723852B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113221970A (en) * | 2021-04-25 | 2021-08-06 | 武汉工程大学 | Deep convolutional neural network-based improved multi-label semantic segmentation method |
CN114612717B (en) * | 2022-03-09 | 2023-05-26 | 四川大学华西医院 | AI model training label generation method, training method, using method and equipment |
CN117572531B (en) * | 2024-01-16 | 2024-03-26 | 电子科技大学 | Intelligent detector embedding quality testing method and system |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10657364B2 (en) * | 2016-09-23 | 2020-05-19 | Samsung Electronics Co., Ltd | System and method for deep network fusion for fast and robust object detection |
US10579897B2 (en) * | 2017-10-02 | 2020-03-03 | Xnor.ai Inc. | Image based object detection |
US11144065B2 (en) * | 2018-03-20 | 2021-10-12 | Phantom AI, Inc. | Data augmentation using computer simulated objects for autonomous control systems |
KR20200052446A (en) * | 2018-10-30 | 2020-05-15 | 삼성에스디에스 주식회사 | Apparatus and method for training deep learning model |
Patent Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599939A (en) * | 2016-12-30 | 2017-04-26 | 深圳市唯特视科技有限公司 | Real-time target detection method based on region convolutional neural network |
CN108197687A (en) * | 2017-12-27 | 2018-06-22 | 江苏集萃智能制造技术研究所有限公司 | A kind of webpage two-dimensional code generation method |
CN108416287A (en) * | 2018-03-04 | 2018-08-17 | 南京理工大学 | A kind of pedestrian detection method excavated based on omission negative sample |
CN108875819A (en) * | 2018-06-08 | 2018-11-23 | 浙江大学 | A kind of object and component associated detecting method based on shot and long term memory network |
WO2019238976A1 (en) * | 2018-06-15 | 2019-12-19 | Université de Liège | Image classification using neural networks |
CN108960143A (en) * | 2018-07-04 | 2018-12-07 | 北京航空航天大学 | Detect deep learning method in a kind of naval vessel in High Resolution Visible Light remote sensing images |
CN109285139A (en) * | 2018-07-23 | 2019-01-29 | 同济大学 | A kind of x-ray imaging weld inspection method based on deep learning |
CN109800778A (en) * | 2018-12-03 | 2019-05-24 | 浙江工业大学 | A kind of Faster RCNN object detection method for dividing sample to excavate based on hardly possible |
CN110610210A (en) * | 2019-09-18 | 2019-12-24 | 电子科技大学 | Multi-target detection method |
CN110716792A (en) * | 2019-09-19 | 2020-01-21 | 华中科技大学 | Target detector and construction method and application thereof |
CN111091105A (en) * | 2019-12-23 | 2020-05-01 | 郑州轻工业大学 | Remote sensing image target detection method based on new frame regression loss function |
Non-Patent Citations (2)
Title |
---|
Study of object detection based on Faster R-CNN; BIN LIU et al.; 2017 Chinese Automation Congress (CAC); 2017-10-22; pp. 6233-6236 *
Grasp point recognition for irregular 3D objects based on improved Mask RCNN; Tang Boheng; China Master's Theses Full-text Database (Information Science and Technology); 2019-08-15 (No. 08); pp. I138-661 *
Also Published As
Publication number | Publication date |
---|---|
CN111723852A (en) | 2020-09-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |