
CN117557774A - Unmanned aerial vehicle image small target detection method based on improved YOLOv8 - Google Patents


Info

Publication number
CN117557774A
CN117557774A (application CN202311456286.XA)
Authority
CN
China
Prior art keywords
unmanned aerial vehicle
image
convolution
yolov8
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311456286.XA
Other languages
Chinese (zh)
Inventor
郭小伟 (Guo Xiaowei)
李俊武 (Li Junwu)
陈跃冲 (Chen Yuechong)
封征 (Feng Zheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yangtze River Delta Research Institute of UESTC Huzhou
Original Assignee
Yangtze River Delta Research Institute of UESTC Huzhou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yangtze River Delta Research Institute of UESTC Huzhou filed Critical Yangtze River Delta Research Institute of UESTC Huzhou
Priority to CN202311456286.XA
Publication of CN117557774A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • G06V20/17Terrestrial scenes taken from planes or by drones
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Remote Sensing (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an unmanned aerial vehicle (UAV) image small-target detection method based on an improved YOLOv8. The method collects and labels various images shot by UAVs and establishes a UAV image data set. Starting from the original YOLOv8 network, it introduces the backbone network ITNet, replaces standard Conv convolutions with the dynamic convolution ODConv, adopts the neck module SGFPN, introduces the feature fusion module CSF, and replaces nearest-neighbor sampling with the CARAFE up-sampling method. The improved YOLOv8 network structure is used as the UAV image recognition network, a deep learning model for UAV small-target recognition detection is obtained through training, and UAV images are detected with high accuracy.

Description

Unmanned aerial vehicle image small target detection method based on improved YOLOv8
Technical Field
The invention belongs to the technical field of deep learning target detection, and particularly relates to an unmanned aerial vehicle image small target detection method based on improved YOLOv8.
Background
In recent years, target detection algorithms based on convolutional neural networks have been widely applied and developed in fields such as remote sensing image processing, unmanned aerial vehicle navigation, automatic driving, medical diagnosis, face recognition, and defect detection. Conventional target detection algorithms can largely meet the requirements of various scenes, but they are designed mainly for large and medium targets; for the small targets in a UAV's aerial view, effective features are scarce, sufficient feature information is difficult to extract, and the results are unsatisfactory. Even the most advanced detectors show a large performance gap when detecting small and medium-sized objects.
Currently popular object detectors typically comprise a backbone network and a detection head, and the decisions of the latter depend on the representations output by the former; this design has proven effective. However, small targets carry little feature information to begin with, and almost none of it survives several rounds of downsampling, so the network can hardly learn useful information and the detection head cannot make correct decisions, which is fatal for small-target detection. As a result, current detectors achieve low detection accuracy on UAV small targets.
Disclosure of Invention
The invention aims to provide an unmanned aerial vehicle small target recognition detection method based on improved YOLOv8, so as to solve the technical problem of low accuracy in detecting unmanned aerial vehicle images.
In order to solve the technical problems, the specific technical scheme of the unmanned aerial vehicle small target recognition detection method based on the improved YOLOv8 is as follows:
an unmanned aerial vehicle small target recognition detection method based on improved YOLOv8,
step 1, obtain data from pictures shot by an unmanned aerial vehicle in real-life environments, label ten categories of targets such as people and vehicles, establish an unmanned aerial vehicle picture data set, and apply the Mosaic data enhancement method to the data set;
step 2, take the YOLOv8 network structure as the reference network: introduce the backbone network ITNet (Inverted Triangle Net), replace Conv convolution with the dynamic convolution ODConv, use the neck module SGFPN, introduce the feature fusion module CSF, and replace nearest-neighbor sampling with the CARAFE up-sampling method (a sketch of CARAFE follows these steps); use the improved YOLOv8 network structure as the unmanned aerial vehicle small-target recognition network and obtain a deep learning model for unmanned aerial vehicle small-target recognition detection through training;
and step 3, input the unmanned aerial vehicle small-target image to be detected into the deep learning model for unmanned aerial vehicle small-target recognition detection.
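By way of illustration, the following is a minimal PyTorch sketch of the CARAFE upsampler named in step 2, written from the published CARAFE formulation (content-aware kernel prediction plus reassembly); the channel widths, kernel sizes and upscale factor are assumed defaults, not values fixed by this invention.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CARAFE(nn.Module):
    """Content-Aware ReAssembly of FEatures (Wang et al., 2019), a sketch."""
    def __init__(self, channels, scale=2, k_up=5, k_enc=3, c_mid=64):
        super().__init__()
        self.scale, self.k_up = scale, k_up
        # channel compressor + content encoder predict one k_up*k_up
        # reassembly kernel per upsampled output position
        self.comp = nn.Conv2d(channels, c_mid, 1)
        self.enc = nn.Conv2d(c_mid, (scale ** 2) * (k_up ** 2),
                             k_enc, padding=k_enc // 2)

    def forward(self, x):
        b, c, h, w = x.shape
        s, k = self.scale, self.k_up
        # 1. predict normalized reassembly kernels
        ker = F.pixel_shuffle(self.enc(self.comp(x)), s)   # (b, k*k, s*h, s*w)
        ker = F.softmax(ker, dim=1)
        # 2. gather k x k neighborhoods of the input and broadcast them
        #    to the upsampled grid (nearest mapping back to source pixels)
        nbr = F.unfold(x, k, padding=k // 2).view(b, c * k * k, h, w)
        nbr = F.interpolate(nbr, scale_factor=s, mode='nearest')
        nbr = nbr.view(b, c, k * k, s * h, s * w)
        # 3. reassemble: content-aware weighted sum over each neighborhood
        return (nbr * ker.unsqueeze(1)).sum(dim=2)         # (b, c, s*h, s*w)
```

Unlike nearest-neighbor interpolation, every upsampled pixel here is a learned, input-dependent combination of its source neighborhood, which is what the patent relies on to preserve small-target detail.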
Further, the detection network is modified based on the YOLOv8 network structure and comprises 4 C2f modules, 1 SPPF module, 6 ODConv modules, 7 CSF modules, 7 Concat modules, 3 upsampling modules and 6 Conv modules.
Further, the C2f module comprises a 3×3 convolution layer, a BN (Batch Normalization) layer and a SiLU activation function layer cascaded in sequence;
the SPPF module comprises sequentially cascaded 5×5 pooling layers whose results are spliced through concat;
the Conv module comprises a 1×1 convolution layer, a BN layer and a ReLU activation function layer cascaded in sequence.
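For illustration, below is a minimal PyTorch sketch of an SPPF block of the kind just described: three cascaded 5×5 poolings whose outputs are concatenated. The bottleneck width is an assumption, and the BN and activation layers of the surrounding Conv modules are omitted for brevity.

```python
import torch
import torch.nn as nn

class SPPF(nn.Module):
    """Spatial Pyramid Pooling - Fast: cascaded 5x5 poolings + concat."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_hid = c_in // 2                       # assumed bottleneck width
        self.cv1 = nn.Conv2d(c_in, c_hid, 1, bias=False)
        self.cv2 = nn.Conv2d(c_hid * 4, c_out, 1, bias=False)
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)                       # effective 5x5 pooling
        y2 = self.pool(y1)                      # effective ~9x9
        y3 = self.pool(y2)                      # effective ~13x13
        return self.cv2(torch.cat([x, y1, y2, y3], dim=1))
```

Cascading the same 5×5 pooling reproduces the receptive fields of parallel 5×5/9×9/13×13 poolings at lower cost, which is why SPPF replaced the older SPP block in the YOLO family.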
the ODConv module is represented as
y=(α w1 ⊙α f1 ⊙α c1 ⊙α s1 ⊙W 1 +…+α wn ⊙α fn ⊙α cn ⊙α sn ⊙W n )*x
Wherein x ε R (h x ω x c_in) and y ε R (h x ω x c_out) represent the input and output features, respectively (channel number c_in/c_out, width and height of feature h, ω, respectively), W i Representing an ith convolution kernel consisting of a c_out filter (w_i∈r (kxkxc_in), m=1, …, c_out); x 0_wi×1r represents the attention scalar of the convolution kernel w_i; alpha_si epsilon R (k x k), alpha_ci epsilon R (c_in) and alpha_fi epsilon R (c_out) represent three newly introduced notes, calculated along the spatial dimension, input channel dimension and output channel dimension of the convolution kernel W_i, respectively; x 2 represents multiplication operations along different dimensions of the kernel space.
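To make the formula concrete, here is a simplified PyTorch sketch of an ODConv layer implementing the four attentions above. The attention-head design, reduction ratio and number of candidate kernels are assumptions, and the batch normalization and temperature annealing of the published ODConv are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ODConv2d(nn.Module):
    """Omni-dimensional dynamic convolution (Li et al., ICLR 2022), a sketch.

    Four attentions (spatial a_s, input-channel a_c, output-channel a_f,
    kernel-wise a_w) modulate n candidate kernels, which are summed and
    applied as an ordinary convolution.
    """
    def __init__(self, c_in, c_out, k=3, n_kernels=4, reduction=16):
        super().__init__()
        self.c_in, self.c_out, self.k, self.n = c_in, c_out, k, n_kernels
        self.weight = nn.Parameter(torch.randn(n_kernels, c_out, c_in, k, k) * 0.01)
        c_att = max(c_in // reduction, 4)
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Sequential(nn.Conv2d(c_in, c_att, 1), nn.ReLU(inplace=True))
        # one head per attention dimension
        self.att_s = nn.Conv2d(c_att, k * k, 1)
        self.att_c = nn.Conv2d(c_att, c_in, 1)
        self.att_f = nn.Conv2d(c_att, c_out, 1)
        self.att_w = nn.Conv2d(c_att, n_kernels, 1)

    def forward(self, x):
        b = x.size(0)
        a = self.fc(self.gap(x))                                # (b, c_att, 1, 1)
        a_s = torch.sigmoid(self.att_s(a)).view(b, 1, 1, 1, self.k, self.k)
        a_c = torch.sigmoid(self.att_c(a)).view(b, 1, 1, self.c_in, 1, 1)
        a_f = torch.sigmoid(self.att_f(a)).view(b, 1, self.c_out, 1, 1, 1)
        a_w = torch.softmax(self.att_w(a).view(b, self.n), 1)
        a_w = a_w.view(b, self.n, 1, 1, 1, 1)
        # aggregate the n candidate kernels per sample
        w = (a_w * a_f * a_c * a_s * self.weight.unsqueeze(0)).sum(dim=1)
        # grouped-conv trick: fold the batch into groups for per-sample kernels
        x = x.view(1, b * self.c_in, *x.shape[2:])
        w = w.view(b * self.c_out, self.c_in, self.k, self.k)
        y = F.conv2d(x, w, padding=self.k // 2, groups=b)
        return y.view(b, self.c_out, *y.shape[2:])
```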
The CSF module comprises three branches: the first branch is sequentially cascaded 3×3 RepConv convolution layers, the second branch is a PConv module followed by a Conv module, and the third branch is a Conv module; the outputs of the three branches are spliced through a concat layer.
further, the method for preprocessing the unmanned aerial vehicle image dataset comprises the following steps: the xml file generated using the VOC annotation mode is converted into txt file required for YOLO training.
Further, the data set dividing method comprises: 60% of the data is used as the training set, 20% as the validation set and 20% as the test set.
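A simple sketch of such a 60/20/20 split; the file layout and random seed are assumptions.

```python
import glob
import random

random.seed(0)                              # assumed seed for reproducibility
images = sorted(glob.glob('images/*.jpg'))
random.shuffle(images)
n = len(images)
train = images[:int(0.6 * n)]               # 60% training set
val = images[int(0.6 * n):int(0.8 * n)]     # 20% validation set
test = images[int(0.8 * n):]                # 20% test set
```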
Further, the model training parameters are set as follows: initial learning rate 0.01, momentum 0.937, weight decay 0.0005, training threshold 0.2; picture sizes are normalized to 640×640, the number of iterations is 300, and the batch size is 16.
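For reference, these hyperparameters map directly onto the Ultralytics YOLOv8 trainer, as in the hedged sketch below. The model and dataset YAML names stand in for the patent's unpublished configurations, and the "training threshold 0.2" is omitted because its exact meaning is not specified.

```python
from ultralytics import YOLO

# 'improved-yolov8.yaml' and 'uav.yaml' are placeholders for the patent's
# modified network definition and dataset config (paths + 10 class names)
model = YOLO('improved-yolov8.yaml')
model.train(
    data='uav.yaml',
    epochs=300,             # iteration number from the patent
    imgsz=640,              # images normalized to 640x640
    batch=16,
    lr0=0.01,               # initial learning rate
    momentum=0.937,
    weight_decay=0.0005,
)
```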
compared with the original YOLOv8 target detection network, the improved YOLOv8 network provided by the invention can realize accurate detection of small target objects under a complex background on the detection task of small targets of an unmanned aerial vehicle, and reduces the parameter quantity and the calculation quantity. Firstly, a trunk which increases the number of the convolution of the shallow extraction features is designed, the extraction of the shallow information by the network is enhanced, the full-dimensional dynamic convolution is utilized for encoding, and the extraction capability of the network to the features of the small target is effectively improved. Secondly, a feature fusion module is provided to further enhance the multi-layer and feature fusion capability of the network. Thirdly, a neck structure is designed, shallow information extraction is increased, and the mining capability of the network on small target position information is enhanced.
The method detects UAV small targets in low-altitude scenes with heavy ground-object occlusion and complex backgrounds, and the deep learning approach reduces the labor and time cost of manually collecting and processing data. Data enhancement is used to obtain more comprehensive, higher-quality data.
Drawings
FIG. 1 is a schematic and flow chart of the overall architecture of the present invention;
FIG. 2 is a block diagram of a method study of the present invention;
FIG. 3 is a diagram of the improved YOLOv8 network of the present invention;
FIG. 4 is a block diagram of CSF in accordance with the invention;
FIG. 5 is a graph comparing the evaluation indexes before and after the model improvement.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
A method for detecting low-slow-small targets based on edge computing, as shown in FIG. 2, comprises the following steps:
s1, collecting image sets of different targets under different exposure degrees, and processing the image sets to obtain a low-small slow target data set;
specifically, images of different targets under different exposure degrees are acquired through a camera, low and slow targets in the images are marked by a marking tool, so that in order to enhance the generalization, the mosaine and mixup combined data of yolov4 are referenced, the data dimension is enhanced, and the image fuzzy data of different degrees are increased according to different fuzzification of the small targets, so that the detection precision of the fuzzy data is improved.
S2, designing a trunk, enhancing the extraction of shallow information by a network, and encoding by utilizing full-dimensional dynamic convolution.
Specifically, through extensive experiments we find that downsampling improves translation invariance, avoids overfitting and reduces computational cost. However, small objects occupy very few pixels, and downsampling may remove the important features that identify them. The only way to preserve information about small features is to encode those features in the earliest layers with convolution filters and pass the information on to subsequent layers. Yet in existing backbones the number of shallow convolution filters is kept to a minimum to reduce the computational burden, which may cause the loss of key discriminative features of small targets.
The original CSPDarkNet53 reduces the feature map size by a factor of 4 within its first two convolution layers. Using such a backbone for tiny-object detection may cause tiny-object information to vanish from the feature maps before it is fully extracted. To solve this problem, we propose ITNet: compared with the original backbone, the number of convolution kernels used for feature extraction is increased in the shallow layers, while the number of kernels is decreased in the deep layers to improve computational efficiency. Furthermore, we use the full-dimensional dynamic convolution ODConv during downsampling so as to preserve the full-dimensional information of the object.
S3, designing a feature fusion module CSF based on upper and lower layers, and designing a neck structure SGFPN, wherein more shallow and high-resolution information is reserved.
Specifically, YOLOv8 uses PANet for feature fusion, fusing top-down and bottom-up feature layers of different scales. The PAN structure in YOLOv8 uses bottom-up paths and lateral connections: the top-down path upsamples spatially coarser but semantically stronger feature maps, yielding higher-resolution features, which are then enhanced by fusing them through lateral connections with features at the same level. Each lateral connection merges feature maps of the same spatial size from the bottom-up path and the top-down path.
Shallow feature maps have lower-level semantics, but their activations are more accurately localized because they are downsampled fewer times; therefore this layer's feature map is also fused when multi-scale features are acquired, and a detection head is added. After adding the extra detection head, the performance improvement for small-target detection is very significant, although computation and memory costs increase.
In addition, P2 is downsampled only 4× relative to the input picture and contains much interference information, so we use a feature fusion module to better extract features.
GFPN enhances feature interaction through queen-fusion, but this also brings a large number of additional upsampling and downsampling operations, which are harmful for small targets whose information is easily lost during resampling. Its skip-layer connections transmit information from early nodes to later stages, but that information mostly reaches the subsequent layers through lateral transmission anyway, so these connections produce redundant information transfer while introducing more parameters and computation, reducing model efficiency. To further explore effective multi-scale feature fusion and achieve a better detection effect, we improve the connection scheme of the feature fusion layer: the structure adds cross-scale links and uses a modified giraffe feature pyramid network for feature fusion.
The SGFPN of the invention retains more small-target information in the upper layers by adding fusion of the upper-layer features. It integrates more features, realizes multi-scale feature fusion, and obtains a larger receptive field and accurate object positions. After the P2 layer is added, an upsampling node is added on the P3 layer and connected laterally with the P2 layer; the F3, F4 and F5 nodes are connected with the P2, P3 and P4 nodes respectively, and the N3, N4 and N5 nodes with the F2, F3 and F4 nodes respectively. These added connections allow the features to be fused better. The final improved structure is shown in FIG. 3.
The fusion module used in the present invention is CSF (Cross-Scale Fusion), which fuses the incoming multi-layer feature maps; the structure of the CSF module is shown in FIG. 4. The original feature fusion module adopts simple channel concatenation and merely superimposes features. To introduce context information and refine the feature maps, we propose the feature fusion module CSF. For each scale feature at level k, the output is built from the concatenation of the feature maps generated in all previous layers, where Concat(·) denotes that concatenation, Conv1(·) denotes a 3×3 convolution, and Conv2(·) denotes a 1×1 convolution:
$\mathrm{BasicBlock}(P_1) = \mathrm{Conv1}(\mathrm{RepConv}(P_1))$
where RepConv is a convolution block that combines a 3×3 convolution, a 1×1 convolution and an identity mapping in one convolution layer; the structure is shown in FIG. 4. RepConv can learn rich features: during training it is a multi-branch structure, which improves performance, and through structural re-parameterization the block can be converted at inference time into a straight-through structure of consecutive 3×3 convolutions and ReLU activations, accelerating inference.
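The following simplified PyTorch sketch shows the re-parameterization idea; the per-branch batch normalization of the published RepVGG-style block, and the fusing of its statistics, are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepConv(nn.Module):
    """Re-parameterizable conv: 3x3 + 1x1 (+ identity) branches at train
    time, fused into a single 3x3 conv for inference. A simplified sketch."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.conv3 = nn.Conv2d(c_in, c_out, 3, padding=1, bias=False)
        self.conv1 = nn.Conv2d(c_in, c_out, 1, bias=False)
        self.identity = c_in == c_out
        self.fused = None

    def forward(self, x):
        if self.fused is not None:              # deployed, straight-through
            return F.relu(self.fused(x))
        y = self.conv3(x) + self.conv1(x)
        if self.identity:
            y = y + x
        return F.relu(y)

    def fuse(self):
        """Structural re-parameterization: merge branches into one 3x3 kernel."""
        w = self.conv3.weight.clone()
        w += F.pad(self.conv1.weight, [1, 1, 1, 1])   # 1x1 -> centered 3x3
        if self.identity:
            eye = torch.zeros_like(w)
            for i in range(w.size(0)):
                eye[i, i, 1, 1] = 1.0                  # identity as a 3x3 kernel
            w += eye
        self.fused = nn.Conv2d(w.size(1), w.size(0), 3, padding=1, bias=False)
        self.fused.weight.data = w
```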
Finally, the concatenation operation truncates the gradient flow to prevent different layers from learning duplicate gradient information:
$P_{out} = \mathrm{Concat}(P_1, P_2, P_3, \mathrm{BasicBlock}(P_1), \mathrm{BasicBlock}^2(P_1), \mathrm{BasicBlock}^3(P_1))$
where $\mathrm{BasicBlock}^n(P_1)$ denotes $n$ cascaded applications of BasicBlock(·). PConv refers to a depthwise convolution with a 3×3 kernel, used to capture the important local spatial regions of each channel.
The CSF module retains the advantages of RepConv feature reuse and structural re-parameterization while truncating the gradient flow to prevent excessive duplicate gradient information; it fuses the various feature maps well and accelerates inference.
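Combining the pieces above, a hedged PyTorch sketch of a CSF-style block built from the P_out equation is given below. It reuses the RepConv sketch from earlier, assumes the three inputs share the same channel count and spatial size, and stands in for the patent's exact wiring (e.g., the PConv branch), which the text does not fully specify.

```python
import torch
import torch.nn as nn

class CSF(nn.Module):
    """Cross-Scale Fusion sketch following the description above:
    P_out = Concat(P1, P2, P3, BB(P1), BB(BB(P1)), BB(BB(BB(P1))))
    with BB(x) = Conv1(RepConv(x)). Channel widths are assumptions."""
    def __init__(self, c):
        super().__init__()
        # BasicBlock = 3x3 RepConv (from the sketch above) + 3x3 conv (Conv1)
        def basic_block():
            return nn.Sequential(
                RepConv(c, c),
                nn.Conv2d(c, c, 3, padding=1, bias=False),
            )
        self.bb1, self.bb2, self.bb3 = basic_block(), basic_block(), basic_block()
        self.out = nn.Conv2d(6 * c, c, 1, bias=False)   # Conv2: 1x1 fusion

    def forward(self, p1, p2, p3):
        b1 = self.bb1(p1)
        b2 = self.bb2(b1)                       # BasicBlock applied twice
        b3 = self.bb3(b2)                       # BasicBlock applied three times
        # concatenation keeps each branch's gradient path distinct
        return self.out(torch.cat([p1, p2, p3, b1, b2, b3], dim=1))
```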
In summary, these modules together form the Backbone part of the YOLOv8 network structure, extracting and fusing multi-scale feature information to support the accuracy and robustness of the target detection task.
A virtual environment for model training is built on a GPU server, and the training set is input into the improved YOLOv8 network structure for target detection model training. After training, a deep learning model for unmanned aerial vehicle image recognition detection is obtained; the validation set is then input into this model for verification, the model is optimized according to the validation results, and finally the best-performing deep learning model for unmanned aerial vehicle image recognition detection is obtained.
In one embodiment, a 3×3 convolution and a 1×1 convolution are used as the final output module of the YOLOv8 network. The detected feature maps at three different pixel scales are each input into a YOLO Head for decoding: global features are extracted through the 3×3 convolution layer, the 1×1 convolution layer acts as the fully connected mapping, and finally the predicted bounding box, confidence value and category are computed (a sketch of such a head follows). After the YOLO Head, the loss function value of the detection model is minimized through iterative computation, and when the training iterations are complete, the model with the highest detection accuracy is selected as the final detection model.
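A minimal sketch of the decoding head as described, a 3×3 convolution followed by a 1×1 convolution producing box, confidence and class terms. Note the stock YOLOv8 head is anchor-free and decoupled, so the anchor count here is an assumption that follows the text above rather than the Ultralytics code.

```python
import torch.nn as nn

def make_head(c_in, n_classes, n_anchors=3):
    """3x3 conv extracts features, then a 1x1 conv maps to predictions."""
    n_out = n_anchors * (4 + 1 + n_classes)   # box(4) + objectness(1) + classes
    return nn.Sequential(
        nn.Conv2d(c_in, c_in, 3, padding=1),
        nn.SiLU(inplace=True),
        nn.Conv2d(c_in, n_out, 1),
    )
```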
The modified structures and modules are stacked in sequence following the form of the original YOLOv8 network structure, yielding the improved YOLOv8 network structure. Model training parameters include:
the initial learning rate was 0.01, the momentum was set to 0.937, the weight decay was set to 0.0005, the training threshold was 0.2, the picture sizes were all normalized to 640 x 640, the number of iterations was 300, and the batch size was 16.
The data set dividing method comprises the following steps: 60% data was used as training set, 20% data was used as validation set, and 20% data was used as test set.
The bounding box loss is calculated using CIoU, the object and category losses are calculated using cross entropy, and back propagation updates the model parameters.
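For completeness, a standard CIoU loss sketch for corner-format boxes; this is the published CIoU definition, not code from the patent.

```python
import math
import torch

def ciou_loss(b1, b2, eps=1e-7):
    """CIoU loss for boxes in (x1, y1, x2, y2) format."""
    # intersection over union
    iw = (torch.min(b1[..., 2], b2[..., 2]) - torch.max(b1[..., 0], b2[..., 0])).clamp(0)
    ih = (torch.min(b1[..., 3], b2[..., 3]) - torch.max(b1[..., 1], b2[..., 1])).clamp(0)
    inter = iw * ih
    a1 = (b1[..., 2] - b1[..., 0]) * (b1[..., 3] - b1[..., 1])
    a2 = (b2[..., 2] - b2[..., 0]) * (b2[..., 3] - b2[..., 1])
    iou = inter / (a1 + a2 - inter + eps)
    # squared center distance over squared enclosing-box diagonal
    cw = torch.max(b1[..., 2], b2[..., 2]) - torch.min(b1[..., 0], b2[..., 0])
    ch = torch.max(b1[..., 3], b2[..., 3]) - torch.min(b1[..., 1], b2[..., 1])
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((b1[..., 0] + b1[..., 2] - b2[..., 0] - b2[..., 2]) ** 2 +
            (b1[..., 1] + b1[..., 3] - b2[..., 1] - b2[..., 3]) ** 2) / 4
    # aspect-ratio consistency term
    w1, h1 = b1[..., 2] - b1[..., 0], b1[..., 3] - b1[..., 1]
    w2, h2 = b2[..., 2] - b2[..., 0], b2[..., 3] - b2[..., 1]
    v = (4 / math.pi ** 2) * (torch.atan(w2 / (h2 + eps)) -
                              torch.atan(w1 / (h1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return 1 - iou + rho2 / c2 + alpha * v
```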
It should be emphasized that the examples described herein are illustrative rather than limiting; the invention therefore includes, but is not limited to, the examples given in the detailed description, and other embodiments derived by persons skilled in the art from the technical solutions of the invention fall equally within the scope of the invention.
It will be understood that the invention has been described in terms of several embodiments, and that various changes and equivalents may be made to these features and embodiments by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (5)

1. An unmanned aerial vehicle image small target detection method based on improved YOLOv8 is characterized by comprising the following steps:
step 1, obtain data from pictures shot by an unmanned aerial vehicle in real-life environments, label ten categories of targets such as people and vehicles, establish an unmanned aerial vehicle picture data set, and apply the Mosaic data enhancement method to the data set;
step 2, take the YOLOv8 network structure as the reference network: introduce the backbone network ITNet (Inverted Triangle Net), replace Conv convolution with the dynamic convolution ODConv, use the neck module SGFPN, introduce the feature fusion module CSF, and replace nearest-neighbor sampling with the CARAFE up-sampling method; use the improved YOLOv8 network structure as the unmanned aerial vehicle image recognition network and obtain a deep learning model for unmanned aerial vehicle small-target recognition detection through training;
and step 3, input the unmanned aerial vehicle small-target image to be detected into the deep learning model for unmanned aerial vehicle small-target recognition detection.
2. The unmanned aerial vehicle image small target detection method based on improved YOLOv8 of claim 1, wherein the specific implementation of step 1 comprises the following steps:
step 1.1, acquire images shot by the unmanned aerial vehicle in real environments through a mobile terminal, and label the acquired images using the LabelImg tool;
step 1.2, perform data enhancement on the data set using the Mosaic data enhancement method, and establish the unmanned aerial vehicle image data set.
3. The unmanned aerial vehicle image small target detection method based on improved YOLOv8 of claim 2, wherein the Mosaic data enhancement method performs a series of image processing operations on the given image files, including randomly selecting several pictures, random scaling, random arrangement, stitching, cropping, horizontal flipping, rotating by 90 degrees, decreasing image brightness, increasing image brightness, blurring, adding salt-and-pepper noise, and adding Gaussian noise.
4. The unmanned aerial vehicle image small target detection method based on improved YOLOv8 of claim 1, wherein the specific implementation of step 2 comprises: the number of convolution kernels used for feature extraction is increased in the shallow layers, while the number of kernels is decreased in the deep layers to improve computational efficiency; furthermore, the full-dimensional dynamic convolution ODConv is used during downsampling so as to preserve the full-dimensional information of the object.
5. The unmanned aerial vehicle image small target detection method based on improved YOLOv8 of claim 1, wherein the specific implementation of step 2 further comprises:
the fusion module used is CSF (Cross-Scale Fusion), which fuses the incoming multi-layer feature maps; for each scale feature at level k, the output is built from the concatenation of the feature maps generated in all previous layers, where Concat(·) denotes that concatenation, Conv1(·) denotes a 3×3 convolution and Conv2(·) denotes a 1×1 convolution:
$\mathrm{BasicBlock}(P_1) = \mathrm{Conv1}(\mathrm{RepConv}(P_1))$
where RepConv is a convolution block that combines a 3×3 convolution, a 1×1 convolution and an identity mapping in one convolution layer;
finally, the concatenation operation truncates the gradient flow to prevent different layers from learning duplicate gradient information:
$P_{out} = \mathrm{Concat}(P_1, P_2, P_3, \mathrm{BasicBlock}(P_1), \mathrm{BasicBlock}^2(P_1), \mathrm{BasicBlock}^3(P_1))$
where $\mathrm{BasicBlock}^n(P_1)$ denotes $n$ cascaded applications of BasicBlock(·); PConv refers to a depthwise convolution with a 3×3 kernel, used to capture the important local spatial regions of each channel.
CN202311456286.XA 2023-11-03 2023-11-03 Unmanned aerial vehicle image small target detection method based on improved YOLOv8 Pending CN117557774A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311456286.XA CN117557774A (en) 2023-11-03 2023-11-03 Unmanned aerial vehicle image small target detection method based on improved YOLOv8

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311456286.XA CN117557774A (en) 2023-11-03 2023-11-03 Unmanned aerial vehicle image small target detection method based on improved YOLOv8

Publications (1)

Publication Number Publication Date
CN117557774A true CN117557774A (en) 2024-02-13

Family

ID=89813819

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311456286.XA Pending CN117557774A (en) 2023-11-03 2023-11-03 Unmanned aerial vehicle image small target detection method based on improved YOLOv8

Country Status (1)

Country Link
CN (1) CN117557774A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118247581A (en) * 2024-05-23 2024-06-25 中国科学技术大学 Method and device for labeling and analyzing gestures of key points of animal images
CN118658047A (en) * 2024-08-20 2024-09-17 成都唐源电气股份有限公司 Small target detection method based on improved YOLOv model


Similar Documents

Publication Publication Date Title
CN110135366B (en) Shielded pedestrian re-identification method based on multi-scale generation countermeasure network
CN113298818B (en) Remote sensing image building segmentation method based on attention mechanism and multi-scale features
CN110956094B (en) RGB-D multi-mode fusion personnel detection method based on asymmetric double-flow network
CN115331087B (en) Remote sensing image change detection method and system fusing regional semantics and pixel characteristics
CN109934200B (en) RGB color remote sensing image cloud detection method and system based on improved M-Net
CN112150493B (en) Semantic guidance-based screen area detection method in natural scene
CN113052210A (en) Fast low-illumination target detection method based on convolutional neural network
CN117557774A (en) Unmanned aerial vehicle image small target detection method based on improved YOLOv8
CN111666842B (en) Shadow detection method based on double-current-cavity convolution neural network
Delibasoglu et al. Improved U-Nets with inception blocks for building detection
CN114519819B (en) Remote sensing image target detection method based on global context awareness
CN113516126A (en) Adaptive threshold scene text detection method based on attention feature fusion
CN105243154A (en) Remote sensing image retrieval method and system based on significant point characteristics and spare self-encodings
CN112686828B (en) Video denoising method, device, equipment and storage medium
CN113111740A (en) Characteristic weaving method for remote sensing image target detection
CN112149526B (en) Lane line detection method and system based on long-distance information fusion
CN114926734B (en) Solid waste detection device and method based on feature aggregation and attention fusion
Liu et al. CAFFNet: channel attention and feature fusion network for multi-target traffic sign detection
CN116071676A (en) Infrared small target detection method based on attention-directed pyramid fusion
CN115861756A (en) Earth background small target identification method based on cascade combination network
CN117197462A (en) Lightweight foundation cloud segmentation method and system based on multi-scale feature fusion and alignment
CN117726954A (en) Sea-land segmentation method and system for remote sensing image
CN117994573A (en) Infrared dim target detection method based on superpixel and deformable convolution
CN112926667A (en) Method and device for detecting saliency target of depth fusion edge and high-level feature
CN117392508A (en) Target detection method and device based on coordinate attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination