
CN112101113B - Lightweight unmanned aerial vehicle image small target detection method - Google Patents

Lightweight unmanned aerial vehicle image small target detection method Download PDF

Info

Publication number
CN112101113B
CN112101113B (application number CN202010819487.1A)
Authority
CN
China
Prior art keywords
target
sequence
frame
branch
central point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010819487.1A
Other languages
Chinese (zh)
Other versions
CN112101113A (en)
Inventor
李红光
王蒙
丁文锐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beihang University
Original Assignee
Beihang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beihang University filed Critical Beihang University
Priority to CN202010819487.1A priority Critical patent/CN112101113B/en
Publication of CN112101113A publication Critical patent/CN112101113A/en
Application granted granted Critical
Publication of CN112101113B publication Critical patent/CN112101113B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 20/10: Terrestrial scenes
    • G06V 20/13: Satellite images
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/24: Classification techniques
    • G06F 18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/20: Image preprocessing
    • G06V 10/32: Normalisation of the pattern dimensions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/44: Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Astronomy & Astrophysics (AREA)
  • Remote Sensing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a lightweight unmanned aerial vehicle image small target detection method, belonging to the technical field of unmanned aerial vehicle image processing. The invention processes each frame of the unmanned aerial vehicle video to be detected in temporal order as follows: the image is scaled and input into a Revised MobileNetV2 feature extractor, which outputs a feature map; the feature map is input into a synchronous up-sampling and detection module, which predicts the positions of the target center points and their corresponding scales to obtain all predicted target bounding boxes in the frame; after all frames of the video to be detected have been processed, fast sequence non-maximum suppression is applied to the prediction results of all frames, and the target detection result of the video to be detected is output. The invention uses a lightweight backbone network to detect small targets in unmanned aerial vehicle images, reduces false detections of small targets, improves detection efficiency, and enables fast and accurate detection of small targets in unmanned aerial vehicle video.

Description

Lightweight unmanned aerial vehicle image small target detection method
Technical Field
The invention belongs to the technical field of unmanned aerial vehicle image processing, and particularly relates to a light unmanned aerial vehicle image small target detection method.
Background
With the maturation of unmanned aerial vehicle technology and the growing number of unmanned aerial vehicle suppliers, the cost of unmanned aerial vehicles has gradually decreased. In recent years, unmanned aerial vehicles have received wide attention in many fields such as geology, agriculture and forestry, and crowd/traffic monitoring. An unmanned aerial vehicle can carry a variety of peripheral sensors, including infrared image sensors, visible-light image sensors, acceleration sensors, barometric sensors, and the like. Among them, the visible-light image sensor provides rich environmental information, so visible-light image understanding is one of the popular directions of unmanned aerial vehicle application research. Target detection technology can locate targets of the categories of interest in an image and undoubtedly provides effective support for various unmanned aerial vehicle tasks.
According to the definition of the MS COCO (Microsoft Common Objects in Context) data set, an object occupying no more than 32 × 32 pixels is considered a small object. Typical targets in unmanned aerial vehicle images are small, numerous, and densely distributed.
Target detection technology has developed over a long period, and detection accuracy has continuously improved from traditional methods based on hand-crafted features to deep learning methods based on convolutional neural networks. At present, convolutional neural network-based methods are the mainstream of target detection. Most visible-light unmanned aerial vehicle image target detection algorithms are optimized to improve accuracy as much as possible, and efficiency is rarely considered. Target detection on the onboard platform of an unmanned aerial vehicle is of great significance: it not only improves the flexibility of unmanned aerial vehicle use and the intelligence of the vehicle itself, but also allows operation under poor communication conditions. However, the storage and computational resources of the onboard platform are limited, which requires a target detection algorithm with low computational cost and a small number of parameters.
Disclosure of Invention
Based on the importance of target detection on unmanned aerial vehicles and the requirement for a detection method with low computational cost and few parameters, the invention provides a lightweight unmanned aerial vehicle image small target detection method for the scenario of target detection in visible-light unmanned aerial vehicle images on the onboard platform. The unmanned aerial vehicle image data source is in video format, that is, the input unmanned aerial vehicle images are the frames of a video in temporal order.
The invention provides a lightweight unmanned aerial vehicle image small target detection method, which comprises the following steps:
step one: scaling the current frame image to be detected to 512 × 512 pixels;
step two: inputting the scaled image into a Revised MobileNetV2 feature extractor and outputting a feature map of size 16 × 16;
step three: inputting the extracted feature map into a synchronous up-sampling and detection module. The synchronous up-sampling and detection module contains four branches based on the sub-pixel convolution structure: a center point branch, a center point offset branch, a center point target branch and a scale branch. The first three branches jointly determine the positions of the center points, and the scale branch determines the scale of the target corresponding to each center point;
step four: obtaining all predicted target boxes of the current frame according to the predicted target center point positions and the corresponding scales, and storing the result. Judging whether all frames of the current video to be detected have been processed; if so, entering step five, otherwise, returning to step one for the remaining frames;
step five: performing fast sequence non-maximum suppression on the prediction results of all frames of the video to be detected to obtain the final target detection result.
The fast sequence non-maximum suppression first performs suppression and de-duplication of the predicted target boxes, and then performs sequence selection and re-scoring of the predicted target boxes. Sequence selection means selecting, for each frame, the first K predicted target boxes after de-duplication sorted by descending score, computing the IoU values between the predicted target boxes of adjacent frames, and associating the predicted target boxes whose IoU is larger than a threshold B; at this point the whole video sequence yields several overlapping temporal target box sequences of different lengths, from which the temporal target box sequence with the largest total score is selected. Re-scoring means assigning the average score of the target boxes of the selected temporal sequence to every bounding box in that sequence. After these three steps, the temporal target box sequence with the largest total score is excluded, and selection and re-scoring of the predicted target box sequences are repeated until no temporal target box sequence can be selected; where K is a positive integer and B is a real number greater than 0.
Compared with the prior art, the invention has the following advantages and positive effects: (1) the invention applies a lightweight design to the typical framework of the center point prediction method, uses a lightweight backbone network and designs a synchronous up-sampling and detection module, forming a more efficient detection framework; (2) the invention adds a binary center point target branch at the detection head, which brings new information to the center point prediction and reduces false detections of small targets to a certain extent; (3) the invention provides a fast sequence non-maximum suppression method in the post-processing part, optimizing the detection results with target temporal information. The method enables fast and accurate on-board target detection for unmanned aerial vehicles.
Drawings
FIG. 1 is an exemplary framework for a center point based prediction approach;
FIG. 2 is a schematic diagram of a lightweight network for detecting small targets of unmanned aerial vehicle images according to the present invention;
FIG. 3 is a flow chart of a fast sequence non-maxima suppression method of the present invention;
fig. 4 is a flow chart of the light unmanned aerial vehicle image small target detection framework of the invention.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples.
The method of the present invention is designed on the typical framework of the center point prediction method, as shown in FIG. 1. The input Image is resized and fed into a Feature Extractor; the extracted features are up-sampled and then fed into a Detection Head that predicts the target Center Point and Scale. The feature extractor is typically built on a classification network, and the feature extractor together with the up-sampling module forms the backbone network of the center point prediction framework. The detection head is responsible for target prediction from the extracted features and contains three branches: a center point branch, a center point offset branch, and a scale branch.
As shown in FIG. 2, the lightweight unmanned aerial vehicle image target detection framework mainly comprises the following parts:
the first part is the Revised MobileNetV2 feature extractor, which acts as the backbone of the network structure of the method of the present invention. The second part is a synchronous Up-sampling and Detection Module (Simultaneous Up-sampling and Detection Module) which is used as a Detection head of the network structure of the method and has the Up-sampling function. The synchronous upsampling and detection module includes four branches, which are a center point branch cls, a center point offset branch offset, a center point target branch obj, and a scale branch wh, respectively.
(1) Revised MobileNetV2 feature extractor.
The feature extractor of the present invention is based on the lightweight classification network MobileNetV2. MobileNetV2 adopts depthwise separable convolution and an Inverted Residual structure, and is a mobile network architecture widely applied to tasks such as detection and segmentation. Usually, the structure before the pooling layer of a classification network is selected for feature extraction in other task algorithms. However, experiments show that when the structure before the MobileNetV2 pooling layer is used as the backbone network of the invention, the last 1 × 1 convolution (Conv) layer has a negative influence on detection accuracy. MobileNetV2 is designed for classification tasks: its final goal is to obtain feature vectors with good discrimination, which are then classified by a fully connected layer, so it is not necessarily fully suitable for detection tasks. Just as MobileNetV2 replaces the traditional nonlinear bottleneck with a linear Bottleneck structure, the feature map output by the final 1 × 1 convolution after the nonlinear ReLU activation may lose information necessary for detection relative to the feature map output by the final linear residual block. Meanwhile, the output dimension of the last 1 × 1 convolution is quite large (1280 dimensions), which also imposes a heavy computational burden on the subsequent synchronous up-sampling and detection module. Therefore, the invention removes the last 1 × 1 convolution of the MobileNetV2 feature extractor to form the Revised MobileNetV2 feature extractor.
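For illustration, a minimal sketch of how such a Revised MobileNetV2 feature extractor can be obtained, assuming torchvision's standard MobileNetV2 implementation as the base (the patent does not specify the implementation, so the library calls and the 320-channel output are assumptions of this sketch):

```python
import torch
from torch import nn
from torchvision.models import mobilenet_v2

# Standard MobileNetV2 backbone; weights and training setup are not specified here.
base = mobilenet_v2(weights=None)

# Drop the last 1x1 convolution block (the 320 -> 1280 channel expansion that
# precedes global pooling), keeping everything up to the final inverted residual.
revised_features = nn.Sequential(*list(base.features.children())[:-1])

x = torch.randn(1, 3, 512, 512)      # a resized 512 x 512 input frame
feat = revised_features(x)           # expected shape: (1, 320, 16, 16)
print(feat.shape)
```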
(2) A synchronous upsampling and detection module.
Deconvolution is a common up-sampling method; compared with interpolation-based up-sampling or un-pooling, it is learnable and can produce finer results. However, deconvolution also suffers from a large computational cost and the checkerboard effect.
The present invention uses sub-pixel convolution instead of deconvolution, which is also a learning-based upsampling method and can be defined as:
FM_HR = PS(W_L × FM_LR + b_L)

where PS is the periodic shuffling operation that rearranges a low-resolution feature map of size H × W × C·r² into a high-resolution feature map FM_HR of size rH × rW × C; W_L and b_L are the convolution weights and bias that raise the dimension of the low-resolution feature map FM_LR by a factor of r², and r is the up-sampling factor. Briefly, sub-pixel convolution first raises the dimension of the input feature map with a convolution layer and then obtains the up-sampled output by periodic rearrangement. Because its principle differs from that of deconvolution, sub-pixel convolution has no checkerboard effect; and because the sub-pixel convolution and the detection head share a common convolution structure, the invention shares the computation of these two layers to form the synchronous up-sampling and detection module.
The synchronous up-sampling and detection module can be seen as a detection head with an integrated up-sampling function. The feature map to be up-sampled is input directly into each branch of the detection head. Each branch contains one 1 × 1 convolution layer whose output dimension is 8² times the final output dimension of the corresponding branch (i.e., r = 8), and the output of this convolution is periodically rearranged to obtain the prediction result of the branch. This design exploits the property of sub-pixel convolution: it not only avoids the excessive parameter count of deconvolution, but also shares the computation of the detection head and the up-sampling, further reducing the computational burden of the algorithm.
The input of the synchronous up-sampling and detection module is the 16 × 16 feature map output by the Revised MobileNetV2 feature extractor, and the module performs 8× up-sampling, so the outputs of the center point branch, the center point offset branch, the center point target branch and the scale branch are all of size 128 × 128. The structure of each branch is essentially the same, consisting of a 1 × 1 convolution layer followed by a periodic rearrangement operation; only the output dimension differs. The center point branch outputs a heat map with one channel per class, the center point target branch outputs a fixed 2-channel heat map, the center point offset branch outputs a fixed 2-channel offset for each center point, and the scale branch outputs a fixed 2-channel scale for each center point.
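As an illustration of the branch structure just described, a minimal PyTorch sketch of one sub-pixel-convolution branch is given below; the 320-channel input, the class count, and the module and variable names are assumptions of this sketch rather than details from the patent:

```python
import torch
from torch import nn

class SubPixelBranch(nn.Module):
    """One branch of the synchronous up-sampling and detection module:
    a 1x1 convolution that raises the channel dimension by r^2, followed by
    periodic rearrangement (PixelShuffle) that performs the 8x up-sampling."""
    def __init__(self, in_ch, out_ch, r=8):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch * r * r, kernel_size=1)
        self.shuffle = nn.PixelShuffle(r)

    def forward(self, x):
        return self.shuffle(self.conv(x))

feat = torch.randn(1, 320, 16, 16)               # feature map from the backbone
num_classes = 10                                  # illustrative class count
cls_map = SubPixelBranch(320, num_classes)(feat)  # center point heat map, (1, C, 128, 128)
obj_map = SubPixelBranch(320, 2)(feat)            # binary center point target heat map
off_map = SubPixelBranch(320, 2)(feat)            # center point offsets
wh_map  = SubPixelBranch(320, 2)(feat)            # scales (width, height) per center
print(cls_map.shape, obj_map.shape, off_map.shape, wh_map.shape)
```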
Compared with the branch definition of the detection head in the conventional center point prediction framework, the invention adds a center point target branch. Because the invention uses an extremely lightweight feature extractor and detection head, the discriminative power of the extracted features and of the detection head is limited, which brings a risk of missed and false detections in unmanned aerial vehicle images where targets are small, densely distributed, and hard to learn. Especially in unmanned aerial vehicle images with complex environments, a complex background is easily detected as a target by mistake. The design is inspired by YOLO (You Only Look Once): YOLO additionally predicts an objectness score for each anchor box, whose physical meaning is the IoU (Intersection over Union) between the anchor box and the ground-truth box, and at detection time the actual score of a target box is the product of the classification score and the objectness score. The invention adds a center point target branch to the detection head. This is a binary branch: it predicts a binary heat map marking whether a point in the image space is the center point of a target of any class of interest, without performing specific classification. Therefore, the invention adds new information to the prediction of the target center point, which helps produce better prediction results. The heat maps output by the binary center point target branch and the multi-class center point branch are trained and predicted independently. The overall loss function L_det of the algorithm of the invention is defined as:
L_det = L_cls + λ_size·L_size + λ_off·L_off + λ_obj·L_obj

where L_cls is the center point classification loss, a Focal Loss defined in the same way as in the CenterNet network. L_size is the target box scale loss and L_off is the center point offset loss; both are L1 losses, and λ_size and λ_off are their corresponding coefficients. L_obj is the center point target loss, a Focal Loss defined in the same way as L_cls, and λ_obj is the coefficient of the target loss, set to 0.5 in the invention. At detection time, the final target box Score is obtained by combining the classification score and the target score of each center point:
Score = heatmap_cls × F(heatmap_obj)
[Formula image omitted: definition of the preprocessing function F(x).]
where F(x) is a preprocessing function of the target score, x represents the value of a pixel in the heat map, heatmap_cls is the class-wise prediction heat map output by the center point branch, and heatmap_obj is the binary target heat map output by the center point target branch. The introduction of the target branch reduces the scores of bounding boxes with low targetness, so the risk of false alarms is effectively reduced.
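A minimal sketch of this score combination is given below, assuming PyTorch tensors. Since the exact form of F(x) appears only as a formula image in the original publication, the thresholded identity used here is purely an illustrative assumption:

```python
import torch

def F(x, thr=0.3):
    # Hypothetical preprocessing of the target score: suppress low responses.
    # The patent's actual F(x) is defined in a formula image not reproduced here.
    return torch.where(x > thr, x, torch.zeros_like(x))

heatmap_cls = torch.rand(1, 10, 128, 128)   # class-wise center point heat map
heatmap_obj = torch.rand(1, 1, 128, 128)    # foreground channel of the target heat map
score = heatmap_cls * F(heatmap_obj)        # Score = heatmap_cls x F(heatmap_obj)
```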
The specific structural parameters of the target detection network implemented by the present invention are shown in Table 1:
table 1 structural parameter table of object detection network of the present invention
[Table image omitted: structural parameters of the target detection network.]
where Conv2d refers to a conventional convolution layer, Bottleneck refers to the inverted residual module in MobileNetV2, t, c, n and s are the parameters of the inverted residual structure (t is the channel expansion factor, c is the output dimension, n is the number of inverted residual modules, and s is the stride), cls is the number of detected classes, and ratio is the up-sampling factor.
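For reference, a minimal sketch of the inverted residual (Bottleneck) block whose parameters t, c, n and s appear in Table 1 is given below, following the standard MobileNetV2 design; this is an illustrative reconstruction, not the patent's code:

```python
import torch
from torch import nn

class InvertedResidual(nn.Module):
    """MobileNetV2-style inverted residual: 1x1 expansion (factor t), 3x3 depthwise
    convolution with stride s, and a linear 1x1 projection to c output channels."""
    def __init__(self, in_ch, c, t, s):
        super().__init__()
        hidden = in_ch * t
        self.use_res = (s == 1 and in_ch == c)
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 1, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, hidden, 3, stride=s, padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden), nn.ReLU6(inplace=True),
            nn.Conv2d(hidden, c, 1, bias=False),
            nn.BatchNorm2d(c),                      # linear bottleneck: no activation
        )

    def forward(self, x):
        y = self.block(x)
        return x + y if self.use_res else y

# n such blocks are stacked per table row; only the first block uses stride s.
block = InvertedResidual(in_ch=32, c=16, t=1, s=1)
print(block(torch.randn(1, 32, 128, 128)).shape)
```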
After the predicted target bounding boxes of all frames are obtained, the invention therefore uses Fast Sequence Non-Maximum Suppression (Fast Seq-NMS) to de-duplicate the target detection results of all frames and to correct the detection results using the temporal association of targets between frames.
In a temporal image sequence, the same object is very likely to appear in adjacent frames, and its position and scale change only slightly. Based on this assumption, inter-frame associations of the same target can be established from the IoU values between the predicted boxes of adjacent frames, and the scores of the predicted bounding boxes can then be re-evaluated on the basis of the target detection network's predictions. Sequence non-maximum suppression (Seq-NMS), as shown in FIG. 3, comprises three steps: 1) bounding box sequence selection, 2) bounding box sequence re-scoring, 3) suppression and de-duplication. Bounding box sequence selection first computes the IoU values between the predicted target bounding boxes of all adjacent frames, then associates the predicted boxes whose IoU is larger than a threshold B; at this point the whole video sequence yields several overlapping temporal bounding box sequences of different lengths, and finally the bounding box sequence with the largest total predicted score is selected. Bounding box sequence re-scoring assigns the average score of the selected bounding box sequence to each bounding box in the sequence. Suppression and de-duplication removes the selected bounding box sequence and also removes, within each frame, the bounding boxes whose IoU with an element of the selected sequence exceeds a certain threshold. After the three steps are completed, a new bounding box sequence selection is started, until no sequence can be selected. The threshold B is typically set to 0.5.
Considering that there are many predicted target boxes, and with efficiency in mind, the invention proposes a more efficient Fast Sequence Non-Maximum Suppression (Fast Seq-NMS) method. Fast Seq-NMS first de-duplicates the predicted bounding boxes and then performs selection and re-scoring of the inter-frame bounding box sequences. De-duplication means applying Non-Maximum Suppression (NMS) to all predicted target bounding boxes of each frame. Meanwhile, Fast Seq-NMS performs sequence selection and re-scoring only on the first K bounding boxes of each frame, sorted by descending score after de-duplication; K is set to 100 in the embodiment of the invention, which greatly reduces the computational burden compared with the approach of Seq-NMS. The bounding box sequence selection and re-scoring steps are the same as in Seq-NMS: after the bounding box sequence with the largest total score is re-scored, it is excluded, and the selection and re-scoring of predicted target bounding box sequences restart, repeating until no sequence can be selected. In the invention, K can be set as a proportion of the number of boxes or as a fixed value.
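A minimal NumPy sketch of the Fast Seq-NMS procedure described above is given below. The function names, the [x1, y1, x2, y2] box format, and the dynamic-programming selection of the highest-scoring sequence are assumptions of this sketch, not the patent's implementation:

```python
import numpy as np

def iou(a, b):
    """IoU between one box a (4,) and an array of boxes b (N, 4)."""
    x1 = np.maximum(a[0], b[:, 0]); y1 = np.maximum(a[1], b[:, 1])
    x2 = np.minimum(a[2], b[:, 2]); y2 = np.minimum(a[3], b[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def nms(boxes, scores, thr=0.5):
    """Greedy per-frame NMS; returns kept indices in descending-score order."""
    order = np.argsort(-scores)
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= thr]
    return np.array(keep, dtype=int)

def fast_seq_nms(frames, K=100, B=0.5, nms_thr=0.5):
    """frames: list of (boxes (N,4), scores (N,)) per frame, in temporal order.
    Returns a list of [boxes, re-scored scores] per frame."""
    # 1) Per-frame suppression/de-duplication, then keep the top-K by score.
    dets = []
    for boxes, scores in frames:
        keep = nms(boxes, scores, nms_thr)[:K]
        dets.append([boxes[keep].copy(), scores[keep].copy()])
    active = [np.ones(len(s), bool) for _, s in dets]

    while True:
        # 2) Dynamic programming: best cumulative score of a chain ending at each box.
        best = [s * a for (_, s), a in zip(dets, active)]
        prev = [np.full(len(s), -1, int) for _, s in dets]
        for t in range(1, len(dets)):
            for i in np.flatnonzero(active[t]):
                cand = np.flatnonzero(active[t - 1])
                if cand.size == 0:
                    continue
                link = cand[iou(dets[t][0][i], dets[t - 1][0][cand]) > B]
                if link.size:
                    j = link[np.argmax(best[t - 1][link])]
                    best[t][i] += best[t - 1][j]
                    prev[t][i] = j
        # 3) Select the temporal sequence with the largest total score; stop if none left.
        totals = [b[a].max() if a.any() else -np.inf for b, a in zip(best, active)]
        t = int(np.argmax(totals))
        if totals[t] == -np.inf or totals[t] <= 0:
            break
        i = int(np.flatnonzero(active[t])[np.argmax(best[t][active[t]])])
        chain = []
        while i != -1:
            chain.append((t, i))
            i, t = prev[t][i], t - 1
        # 4) Re-score: every box in the chain gets the chain's average score,
        #    then the chain is excluded and selection restarts.
        avg = np.mean([dets[ft][1][fi] for ft, fi in chain])
        for ft, fi in chain:
            dets[ft][1][fi] = avg
            active[ft][fi] = False
    return dets
```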
As shown in FIG. 4, the implementation of the method of the invention proceeds as follows: the current frame to be detected is taken from the input unmanned aerial vehicle video in temporal order, scaled to 512 × 512 pixels, and input into the lightweight small-target detection network, and all predicted target bounding boxes are stored; all frames of the video to be detected are scaled and detected by the lightweight small-target detection network in this way; the detection results of all frames are then processed by the fast sequence non-maximum suppression method, and the de-duplicated and corrected target detection result of the video to be detected is output.
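Putting the pieces together, a minimal end-to-end sketch of this flow is given below, reusing the components sketched earlier (revised_features, SubPixelBranch, fast_seq_nms); the OpenCV frame reading and the decode_centers() helper that turns heat-map peaks, offsets and scales into boxes are hypothetical placeholders, not the patent's implementation:

```python
import cv2
import torch

def detect_video(video_path, backbone, heads, decode_centers, K=100, B=0.5):
    """backbone: the Revised MobileNetV2 features; heads: the four sub-pixel branches
    (cls, obj, offset, wh); decode_centers: hypothetical heat-map -> boxes decoder."""
    cap = cv2.VideoCapture(video_path)
    per_frame = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        img = cv2.resize(frame, (512, 512))                        # step 1: resize
        x = torch.from_numpy(img).permute(2, 0, 1)[None].float() / 255.0
        with torch.no_grad():
            feat = backbone(x)                                     # step 2: 16x16 feature map
            cls_m, obj_m, off_m, wh_m = (h(feat) for h in heads)   # step 3: four branches
        boxes, scores = decode_centers(cls_m, obj_m, off_m, wh_m)  # step 4: boxes per frame
        per_frame.append((boxes, scores))
    cap.release()
    return fast_seq_nms(per_frame, K=K, B=B)                       # step 5: Fast Seq-NMS
```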

Claims (5)

1. A lightweight unmanned aerial vehicle image small target detection method, characterized in that the following steps are executed for an input unmanned aerial vehicle image video to be detected:
step one: taking one frame of image in temporal order, and scaling the current frame image to a set size;
step two: inputting the scaled image into a Revised MobileNetV2 feature extractor and outputting a feature map of size 16 × 16;
removing the last layer of 1 × 1 convolution of the MobileNetV2 feature extractor to form a Revised MobileNetV2 feature extractor;
step three: inputting the extracted feature map into a synchronous up-sampling and detection module; the synchronous up-sampling and detection module comprises four branches based on the sub-pixel convolution structure, namely a center point branch, a center point offset branch, a center point target branch and a scale branch, wherein the first three branches jointly determine the positions of the center points, and the scale branch determines the scale of the target corresponding to each center point;
the center point target branch outputs a binary target heat map, marking whether each point in the corresponding image is the center point of a target of any class of interest;
step four: obtaining all predicted target boxes of the current frame according to the predicted target center point positions and the corresponding scales, and storing the result; judging whether all frames of the current video to be detected have been processed; if so, entering step five, otherwise, returning to step one;
step five: performing fast sequence non-maximum suppression on the prediction results of all frames of the video to be detected to obtain the final target detection result.
2. The method of claim 1, wherein in step one, the image is scaled to a size of 512 × 512 pixels.
3. The method of claim 1, wherein in step three, in each branch of the synchronous up-sampling and detection module, the dimension of the feature map is first raised by a convolution layer, and the up-sampled output is then obtained by periodic rearrangement, i.e., by sub-pixel convolution.
4. The method according to claim 1, wherein in step three, when the synchronous upsampling and detecting module detects the feature map, the predicted target frame score is obtained by combining the classification score and the target score of each center point, and is represented as follows:
Score = heatmap_cls × F(heatmap_obj)
[Formula image omitted: definition of the preprocessing function F(x).]
wherein Score is the target box score, heatmap_cls is the class-wise prediction heat map output by the center point branch, heatmap_obj is the binary target heat map output by the center point target branch, F(x) is a preprocessing function of the target score, and x represents the value of a pixel in the heat map.
5. The method according to claim 1, wherein in step five, the fast sequence non-maximum suppression processing comprises: first performing suppression and de-duplication of the predicted target boxes, and then performing sequence selection and re-scoring of the predicted target boxes; the suppression and de-duplication means applying non-maximum suppression to all predicted target boxes of each frame; the sequence selection means selecting, for each frame, the first K predicted target boxes after de-duplication sorted by descending score, computing the IoU values between the predicted target boxes of adjacent frames, and linking the predicted target boxes whose IoU is larger than a threshold B, at which point the whole video sequence yields several overlapping temporal target box sequences of different lengths, from which the temporal target box sequence with the largest total score is selected; the re-scoring means assigning the average score of the target boxes of the selected temporal sequence to each predicted target box in that sequence; then the temporal target box sequence with the largest total score is excluded, and selection and re-scoring of the predicted target box sequences are repeated until no temporal target box sequence can be selected; wherein K is a positive integer and B is a real number greater than 0.
CN202010819487.1A 2020-08-14 2020-08-14 Lightweight unmanned aerial vehicle image small target detection method Active CN112101113B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010819487.1A CN112101113B (en) 2020-08-14 2020-08-14 Lightweight unmanned aerial vehicle image small target detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010819487.1A CN112101113B (en) 2020-08-14 2020-08-14 Lightweight unmanned aerial vehicle image small target detection method

Publications (2)

Publication Number Publication Date
CN112101113A CN112101113A (en) 2020-12-18
CN112101113B (en) 2022-05-27

Family

ID=73753773

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010819487.1A Active CN112101113B (en) 2020-08-14 2020-08-14 Lightweight unmanned aerial vehicle image small target detection method

Country Status (1)

Country Link
CN (1) CN112101113B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733630A (en) * 2020-12-28 2021-04-30 深圳市捷顺科技实业股份有限公司 Channel gate detection method, device, equipment and storage medium
CN114912486A (en) * 2022-05-10 2022-08-16 南京航空航天大学 Modulation mode intelligent identification method based on lightweight network
CN115861891B (en) * 2022-12-16 2023-09-29 北京多维视通技术有限公司 Video target detection method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816695A (en) * 2019-01-31 2019-05-28 中国人民解放军国防科技大学 Target detection and tracking method for infrared small unmanned aerial vehicle under complex background
CN111027547A (en) * 2019-12-06 2020-04-17 南京大学 Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN111401201A (en) * 2020-03-10 2020-07-10 南京信息工程大学 Aerial image multi-scale target detection method based on spatial pyramid attention drive
CN111476252A (en) * 2020-04-03 2020-07-31 南京邮电大学 Computer vision application-oriented lightweight anchor-frame-free target detection method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9881234B2 (en) * 2015-11-25 2018-01-30 Baidu Usa Llc. Systems and methods for end-to-end object detection

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109816695A (en) * 2019-01-31 2019-05-28 中国人民解放军国防科技大学 Target detection and tracking method for infrared small unmanned aerial vehicle under complex background
CN111027547A (en) * 2019-12-06 2020-04-17 南京大学 Automatic detection method for multi-scale polymorphic target in two-dimensional image
CN111401201A (en) * 2020-03-10 2020-07-10 南京信息工程大学 Aerial image multi-scale target detection method based on spatial pyramid attention drive
CN111476252A (en) * 2020-04-03 2020-07-31 南京邮电大学 Computer vision application-oriented lightweight anchor-frame-free target detection method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Vehicle detection in UAV images based on inter-frame motion estimation; Chen Yingxue et al.; Journal of Beijing University of Aeronautics and Astronautics; 2020-03-31; Vol. 46, No. 3; full text *

Also Published As

Publication number Publication date
CN112101113A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN108961235B (en) Defective insulator identification method based on YOLOv3 network and particle filter algorithm
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN112257569B (en) Target detection and identification method based on real-time video stream
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN111079739B (en) Multi-scale attention feature detection method
CN110659664B (en) SSD-based high-precision small object identification method
CN112101113B (en) Lightweight unmanned aerial vehicle image small target detection method
CN113486764B (en) Pothole detection method based on improved YOLOv3
CN110991444B (en) License plate recognition method and device for complex scene
CN111915583B (en) Vehicle and pedestrian detection method based on vehicle-mounted thermal infrared imager in complex scene
CN111079604A (en) Method for quickly detecting tiny target facing large-scale remote sensing image
CN111462140B (en) Real-time image instance segmentation method based on block stitching
CN111126278A (en) Target detection model optimization and acceleration method for few-category scene
CN115937819A (en) Three-dimensional target detection method and system based on multi-mode fusion
CN114781514A (en) Floater target detection method and system integrating attention mechanism
CN118446987A (en) Cabin section inner surface corrosion visual detection method for long and narrow airtight space
CN105184809A (en) Moving object detection method and moving object detection device
CN117975377A (en) High-precision vehicle detection method
CN117011655A (en) Adaptive region selection feature fusion based method, target tracking method and system
CN114219757B (en) Intelligent damage assessment method for vehicle based on improved Mask R-CNN
CN111178158A (en) Method and system for detecting cyclist
CN111008555A (en) Unmanned aerial vehicle image small and weak target enhancement extraction method
CN118115952B (en) All-weather detection method and system for unmanned aerial vehicle image under urban low-altitude complex background
CN117314841A (en) Pixel defect detection method and device based on YOLO
CN118097365A (en) Lightweight object detection network design method for indoor semantic SLAM system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant