
CN113378763B - SAR image-oriented target automatic detection method and system - Google Patents

SAR image-oriented target automatic detection method and system

Info

Publication number
CN113378763B
Authority
CN
China
Prior art keywords
module
features
feature
shuffling
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN202110712235.3A
Other languages
Chinese (zh)
Other versions
CN113378763A (en)
Inventor
肖红光
王杰岚
陈立福
邢进
韩松涛
罗汝
蔡兴敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changsha University of Science and Technology
Original Assignee
Changsha University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changsha University of Science and Technology filed Critical Changsha University of Science and Technology
Priority to CN202110712235.3A priority Critical patent/CN113378763B/en
Publication of CN113378763A publication Critical patent/CN113378763A/en
Application granted granted Critical
Publication of CN113378763B publication Critical patent/CN113378763B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an SAR image-oriented target automatic detection method and system. The method comprises inputting an input image into a pre-trained feature fusion and anti-interference network FFAN to obtain target frames for target detection, wherein the feature fusion and anti-interference network FFAN comprises: a backbone network, a pooling and channel shuffling module, a depth fusion module, a spatial attention mechanism RSAM, a spatial feature fusion module ASFF and a classification regression network. The invention can effectively extract and fuse features and improve regression precision, so that the feature fusion and anti-interference network FFAN significantly improves the detection effect while hardly affecting the testing speed. The method has good universality: it can not only realize high-precision and fast detection of aircraft, but can also be used to detect and identify other dense targets in SAR images.

Description

SAR image-oriented target automatic detection method and system
Technical Field
The invention relates to a target detection technology based on an SAR image, in particular to an SAR image-oriented target automatic detection method and system.
Background
Target detection is a hot topic in the field of high-resolution Synthetic Aperture Radar (SAR). The airplane is a typical man-made target with important military value: efficiently and accurately acquiring airplane target information at airports, in airspace and at other locations is of great significance for acquiring battlefield intelligence in real time and formulating operation plans; in the civil field, airplane detection and identification enables more effective airport management. Because SAR can image day and night and in all weather conditions, target detection and identification in SAR images has been widely studied.
Traditional SAR image target detection methods, such as the Constant False Alarm Rate (CFAR) algorithm and the CA-CFAR algorithm, suffer from poor robustness and a low degree of automation, and it is difficult to achieve further breakthroughs with them.
In recent years, synthetic aperture radar imaging and deep learning technology have developed rapidly, more high-resolution SAR images have become available, and many scholars have begun to study SAR image target detection with deep learning algorithms. Compared with traditional SAR image target detection methods and machine learning algorithms, deep learning methods achieve higher detection precision and shorter test time, have made great research progress, and in particular can realize end-to-end detection. Airplane targets are small, diverse in type and densely distributed, and are therefore harder to detect than targets such as runways and bridges; aircraft detection is thus a difficult point in SAR image target detection. In 2019, Li et al. combined an improved line segment detector (LSD) with Faster R-CNN to design an aircraft detection method for SAR images. To better achieve target detection, more and more scholars have introduced attention mechanisms. Zhao et al. designed a pyramid attention dilated network (PADN) that enhances the learning of fine-grained backscattering features of aircraft. In 2020, Guo et al. proposed a method that combines scattering information with deep learning to detect aircraft targets. Chen et al. proposed a multi-level densely connected dual-attention network to automatically detect airport runway areas, which achieves good extraction results, but its training and testing speeds are relatively slow. In 2020, Chen et al. proposed a new end-to-end deep learning framework that can automatically classify water and shadow regions in SAR images. Chen et al. also proposed a scene classification framework with multi-scale spatial features, called the feature recalibration network (FRN-MSF), which enables high-precision scene classification of SAR images. Tan et al. proposed a geospatial context attention mechanism (GCAM) for automatic detection of airport areas, which is both accurate and fast to test. In 2021, Wang et al. combined the GCAM algorithm with an airplane detection algorithm to provide an efficient SAR image airplane detection framework that realizes automatic and rapid detection of airplane targets.
EWFAN is an efficient SAR image target detection network with considerable advantages in both precision and speed. However, owing to the complex background of SAR images and the variety of airplane types, detecting aircraft with the EWFAN algorithm is still problematic: neither the detection rate nor the false alarm rate reaches a satisfactory level.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: in view of the above problems of the prior art, the invention provides an SAR image-oriented target automatic detection method and system.
In order to solve the technical problems, the invention adopts the technical scheme that:
An SAR image-oriented target automatic detection method comprises inputting an input image into a feature fusion and anti-interference network FFAN which is trained in advance to obtain a target frame for target detection, wherein the feature fusion and anti-interference network FFAN comprises:
a backbone network, which is used for down-sampling the input image into features of different scales, including low-level features P3, middle-level features P4-P6 and high-level features P7;
the pooling and channel shuffling module is used for respectively extracting and fusing the intermediate-level features P4-P6 to obtain the corresponding shuffling features P4'-P6';
the depth fusion module is used for performing depth fusion on the low-level feature P3, the shuffling features P4'-P6' and the high-level feature P7 to respectively obtain the corresponding deeply fused low-level feature P3', shuffling features P4'-P6' and high-level feature P7';
the spatial attention mechanism RSAM is used for further spatial feature extraction and fusion of the deeply fused shuffling feature P6' and high-level feature P7' respectively;
the spatial feature fusion module ASFF is used for further spatial feature extraction and fusion of the deeply fused low-level feature P3' and shuffling features P4'-P5' respectively;
and the classification regression network is used for performing classification regression on all spatial features output by the spatial attention mechanism RSAM and the spatial feature fusion module ASFF through extraction and fusion to generate a target frame for target detection.
Optionally, the backbone network is an EfficientNet network; the EfficientNet network successively down-samples the 512 × 512 input image to different sizes to obtain the corresponding features P1 to P7, and the features obtained from the 3rd to 7th down-sampling steps are respectively used as the low-level feature P3, the middle-level features P4 to P6 and the high-level feature P7.
Optionally, the pooling and channel shuffling module includes three shuffling modules, namely a shuffling module PSM-l, a shuffling module PSM-m, and a shuffling module PSM-s, where the shuffling module PSM-l is configured to extract and fuse the intermediate-level feature P4 to obtain corresponding shuffling features P4', the shuffling module PSM-m is configured to extract and fuse the intermediate-level feature P5 to obtain corresponding shuffling features P5', and the shuffling module PSM-s is configured to extract and fuse the intermediate-level feature P6 to obtain corresponding shuffling features P6', and the shuffling module PSM-l, the shuffling module PSM-m, and the shuffling module PSM-s each include:
the channel separation module is used for dividing the channel of the input characteristics into two parts;
the maximum pooling operation module is used for performing maximum pooling operation on a part of channels output by the channel separation module;
the average pooling operation module is used for carrying out average pooling operation on the other part of channels output by the channel separation module;
the splicing module is used for splicing the output of the maximum pooling operation module, the output of the average pooling operation module and the original input characteristics of the input channel separation module;
the shuffling module is used for carrying out shuffling operation on the output of the splicing module so as to enrich the information of each channel;
the convolution module is used for carrying out convolution operation on the output of the shuffling module to enable the number of output characteristic channels to be the same as that of input characteristics;
the sizes of the pooling cores used by the shuffling module PSM-l, the shuffling module PSM-m and the shuffling module PSM-s for performing the maximum pooling operation and the average pooling operation are {9, 15, 25}, {5, 11} and {3} respectively.
Optionally, the deep fusion module is an improved structure of a BiFPN, and the improved structure adds four jump connections on the basis of the BiFPN, where the first jump connects the middle-level feature P6 and the shuffle feature P5', the second jump connects the middle-level feature P5 and the shuffle feature P4', the third jump connects the middle-level feature P5 and the middle-level feature P7, and the fourth jump connects the middle-level feature P4 and the middle-level feature P6.
Optionally, the classification regression network includes a classification network and a regression network; the classification network uses three 64-channel convolutions and one n_b × n_c convolution to predict the category of each target frame, and the regression network uses three 64-channel convolutions and one n_b × 4 convolution to predict the regression of each prior frame, where n_b is the number of prior frames owned by the feature layer and n_c is the number of target detection categories of the network.
Optionally, the method further includes a step of pre-training the feature fusion and anti-interference network FFAN before inputting the input image into the pre-trained feature fusion and anti-interference network FFAN to obtain the target frame of the target detection, and the loss function L_AIF adopted when training the feature fusion and anti-interference network FFAN is:
L_AIF = L_FL + L_AIoU,
where L_FL is the Focal Loss classification loss and L_AIoU is the AIoU regression loss, with:
L_AIoU = L_CIoU + p·w,
[the equation images in the original publication define the area weight function p and the area similarity w in terms of a, a_t and IoU]
where L_CIoU is the CIoU regression loss, a and a_t respectively denote the areas of the prediction box and the real box, p is a weight function of the areas, w is the similarity of the areas, and IoU is the intersection over union.
In addition, the invention also provides an automatic target detection method facing the SAR image, which comprises the following steps:
s1) generating a window image from an input SAR image in a sliding window mode with an overlapping area;
s2) taking each window image as an input image, and adopting the steps of the SAR image-oriented target automatic detection method to perform target detection to obtain a target frame;
s3) converting the target frame coordinates of each window image into the original SAR image;
and S4) filtering the target frames overlapped in the original SAR image to obtain the target frame serving as the final target detection result.
Optionally, the method adopted in the step S4) of filtering the target frames overlapped in the original SAR image is a non-maximum suppression method.
In addition, the invention also provides an automatic target detection system for the SAR image, which comprises a microprocessor and a memory which are connected with each other, wherein the microprocessor is programmed or configured to execute the steps of the automatic target detection method for the SAR image.
Furthermore, the present invention also provides a computer-readable storage medium having stored therein a computer program programmed or configured to execute the SAR image-oriented target automatic detection method.
Compared with the prior art, the invention has the following advantages:
1. The feature fusion and anti-interference network FFAN adopted by the method comprises: a backbone network, which is used for down-sampling the input image into features of different scales, including the low-level feature P3, the middle-level features P4-P6 and the high-level feature P7; a pooling and channel shuffling module, which is used for respectively extracting and fusing the middle-level features P4-P6 to obtain the corresponding shuffle features P4'-P6'; a depth fusion module, which is used for performing depth fusion on the low-level feature P3, the shuffle features P4'-P6' and the high-level feature P7 to respectively obtain the corresponding deeply fused low-level feature P3', shuffle features P4'-P6' and high-level feature P7'; a spatial attention mechanism RSAM, which is used for further spatial feature extraction and fusion of the deeply fused shuffle feature P6' and high-level feature P7'; a spatial feature fusion module ASFF, which is used for further spatial feature extraction and fusion of the deeply fused low-level feature P3' and shuffle features P4'-P5'; and a classification regression network, which is used for performing classification regression on all the spatial features output, after extraction and fusion, by the spatial attention mechanism RSAM and the spatial feature fusion module ASFF to generate the target frame for target detection. Through this structure, features can be effectively extracted and fused and the regression precision is improved, so that the feature fusion and anti-interference network FFAN significantly improves the detection effect while hardly affecting the testing speed.
2. The method has the advantage of good universality, not only can be used for realizing high-precision and quick detection of the airplane, but also can be used for detecting and identifying other dense targets in the SAR image.
Drawings
Fig. 1 is a schematic structural diagram of the feature fusion and anti-interference network FFAN in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the application workflow of the feature fusion and anti-interference network FFAN in an embodiment of the present invention.
FIG. 3 is a block diagram of a pooling and channel shuffling module in an embodiment of the present invention.
Fig. 4 is a schematic diagram illustrating a structural improvement of the depth fusion module according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating a classification regression network according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating a schematic analysis of a problem existing in CIoU Loss according to an embodiment of the present invention.
Detailed Description
As shown in fig. 1, the method for automatically detecting a target oriented to an SAR image in this embodiment includes inputting an input image into a Feature Fusion and Anti-interference Network (FFAN) which is trained in advance to obtain a target frame for target detection, where the Feature Fusion and Anti-interference Network (FFAN) includes:
a backbone network, which is used for down-sampling the input image into features of different scales, including low-level features P3, middle-level features P4-P6 and high-level features P7;
the pooling and channel shuffling module is used for respectively extracting and fusing the intermediate-level features P4-P6 to obtain the corresponding shuffled features P4'-P6';
the depth fusion module is used for performing depth fusion on the low-level feature P3, the shuffled features P4'-P6' and the high-level feature P7 to respectively obtain the corresponding deeply fused low-level feature P3', shuffled features P4'-P6' and high-level feature P7';
the spatial attention mechanism RSAM is used for further spatial feature extraction and fusion of the deeply fused shuffled feature P6' and high-level feature P7' respectively;
the spatial feature fusion module ASFF is used for further spatial feature extraction and fusion of the deeply fused low-level features P3' and the shuffled features P4' to P5' respectively;
and the classification regression network is used for performing classification regression on all spatial features output by the spatial attention mechanism RSAM and the spatial feature fusion module ASFF through extraction and fusion to generate a target frame for target detection.
Referring to fig. 1, in the present embodiment the backbone network is an EfficientNet network; the EfficientNet network successively down-samples the 512 × 512 input image to different sizes to obtain the corresponding features P1 to P7, and the features obtained from the 3rd to 7th down-sampling steps are used as the low-level feature P3, the middle-level features P4 to P6 and the high-level feature P7 respectively. The EfficientNet network was proposed by Google; it takes the MBConv block of MobileNetV2 as the main component of the model and introduces the Squeeze-and-Excitation (SE) method of SENet to adjust the network structure. The EfficientNet network has certain advantages in terms of both precision and speed and is therefore suitable as the backbone network of an efficient target detection algorithm.
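For illustration, the pyramid layout can be sketched in PyTorch as follows. This is a minimal sketch, not the actual EfficientNet: the backbone is replaced by plain strided convolutions and the channel widths are assumptions chosen only to mimic EfficientNet-B0's general shape; only the way P3-P7 are taken from the 3rd-7th down-sampling of a 512 × 512 input follows the description above.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Minimal stand-in for the EfficientNet backbone described above.

    Each stage halves the spatial size, so a 512x512 input yields features
    P1..P7 with strides 2..128; P3 is the low-level feature, P4-P6 are the
    mid-level features and P7 is the high-level feature consumed by FFAN.
    """

    def __init__(self, channels=(16, 24, 40, 80, 112, 192, 320)):  # assumed widths
        super().__init__()
        self.stages = nn.ModuleList()
        in_ch = 3
        for out_ch in channels:
            self.stages.append(nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1, bias=False),
                nn.BatchNorm2d(out_ch),
                nn.SiLU(inplace=True),
            ))
            in_ch = out_ch

    def forward(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        return feats  # [P1, P2, P3, P4, P5, P6, P7]

feats = TinyBackbone()(torch.zeros(1, 3, 512, 512))
p3, p4, p5, p6, p7 = feats[2:]            # features of the 3rd-7th down-sampling
print([f.shape[-1] for f in feats])       # [256, 128, 64, 32, 16, 8, 4]
```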
The pooling and channel shuffling module is used to further extract valid features. As shown in fig. 3, the pooling and channel shuffling module in this embodiment includes three types of shuffling modules, namely a shuffling module PSM-l, a shuffling module PSM-m, and a shuffling module PSM-s, where the shuffling module PSM-l is configured to extract and merge the intermediate-level feature P4 to obtain corresponding shuffled features P4', the shuffling module PSM-m is configured to extract and merge the intermediate-level feature P5 to obtain corresponding shuffled features P5', and the shuffling module PSM-s is configured to extract and merge the intermediate-level feature P6 to obtain corresponding shuffled features P6', respectively, and as shown in fig. 3, each of the shuffling module PSM-l, the shuffling module PSM-m, and the shuffling module PSM-s includes:
the channel separation module is used for dividing the channel of the input characteristics into two parts;
the maximum pooling operation module is used for performing maximum pooling operation on a part of channels output by the channel separation module;
the average pooling operation module is used for carrying out average pooling operation on the other part of channels output by the channel separation module;
the splicing module is used for splicing the output of the maximum pooling operation module, the output of the average pooling operation module and the original input characteristics of the input channel separation module;
a shuffling module for shuffling the output of the splicing module to enrich the information of each channel;
the convolution module is used for carrying out convolution operation on the output of the shuffling module to enable the number of output characteristic channels to be the same as that of input characteristics;
the pool core sizes of the maximum pool operation and the average pool operation performed by the three of the shuffle module PSM-l, the shuffle module PSM-m and the shuffle module PSM-s are {9,15,25}, {5,11} and {3}, respectively.
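As a concrete illustration, a PSM block roughly matching this description can be sketched in PyTorch as below. The channel-split / dual-pooling / concatenation / channel-shuffle / convolution flow follows the text; the shuffle group count, the use of one pooling pass per kernel size in the set, and the 1 × 1 restoring convolution are assumptions not specified in the description.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PSM(nn.Module):
    """Sketch of the pooling and channel shuffling module (PSM) described above."""

    def __init__(self, channels, pool_sizes=(9, 15, 25), groups=4):
        super().__init__()
        self.pool_sizes = pool_sizes
        self.groups = groups
        # Concatenation: one pooled feature map per kernel size plus the raw input.
        concat_ch = channels * len(pool_sizes) + channels
        self.conv = nn.Conv2d(concat_ch, channels, kernel_size=1)

    @staticmethod
    def channel_shuffle(x, groups):
        b, c, h, w = x.shape
        x = x.view(b, groups, c // groups, h, w).transpose(1, 2).contiguous()
        return x.view(b, c, h, w)

    def forward(self, x):
        half1, half2 = torch.chunk(x, 2, dim=1)              # channel separation
        pooled = []
        for k in self.pool_sizes:
            pad = k // 2
            mx = F.max_pool2d(half1, k, stride=1, padding=pad)   # max pooling branch
            av = F.avg_pool2d(half2, k, stride=1, padding=pad)   # average pooling branch
            pooled.append(torch.cat([mx, av], dim=1))
        y = torch.cat(pooled + [x], dim=1)                   # splice with the original input
        y = self.channel_shuffle(y, self.groups)             # shuffle to mix channel information
        return self.conv(y)                                  # restore the input channel count

# PSM-l / PSM-m / PSM-s differ only in their pooling kernel sets:
psm_l = PSM(64, pool_sizes=(9, 15, 25))
psm_m = PSM(64, pool_sizes=(5, 11))
psm_s = PSM(64, pool_sizes=(3,))
out = psm_l(torch.zeros(1, 64, 64, 64))   # output has the same shape as the input
```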
The depth fusion module is used for performing depth fusion on the features of the five feature-map layers and promoting information exchange between feature layers and channels, so as to improve the detection effect. As shown in fig. 4, the depth fusion module is an improved structure of BiFPN; the improved structure adds four jump connections on the basis of BiFPN (shown by the dashed arrows in sub-graph (c) of fig. 4), where the first jump connects the mid-level feature P6 and the shuffled feature P5', the second jump connects the mid-level feature P5 and the shuffled feature P4', the third jump connects the mid-level feature P5 and the mid-level feature P7, and the fourth jump connects the mid-level feature P4 and the mid-level feature P6. Referring to fig. 4, the depth fusion module in this embodiment is denoted M-BiFPN; M-BiFPN is an improvement over BiFPN, which is in turn an improvement over FPN. BiFPN (as shown in fig. 4 (b)) performs not only top-down fusion but also bottom-up fusion, and removes nodes with only one input source, so its precision and efficiency are improved to some extent compared with FPN (as shown in fig. 4 (a)). In this embodiment, four jump connections are added on the basis of BiFPN and the original feature map of each layer is added to the output, so that more features can be fused to improve the network performance.
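A minimal sketch of one fusion node follows, under the assumption that M-BiFPN keeps BiFPN's usual fast normalized weighted fusion: the extra jump connection added by M-BiFPN simply appears as one more input to the node. The resampling of inputs to a common resolution and channel count, and the exact weighting scheme, are assumptions rather than details quoted from the patent.

```python
import torch
import torch.nn as nn

class WeightedFusion(nn.Module):
    """Fast normalized weighted fusion as used by BiFPN-style nodes (assumed form)."""

    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.w = nn.Parameter(torch.ones(num_inputs))  # one learnable weight per input
        self.eps = eps

    def forward(self, inputs):
        w = torch.relu(self.w)
        w = w / (w.sum() + self.eps)                   # normalize the fusion weights
        return sum(wi * x for wi, x in zip(w, inputs))

# Example P5' node: top-down input, bottom-up input, the node's own P5, plus the
# extra jump connection from P6 added by M-BiFPN (all assumed to be resampled to
# the same resolution and channel count beforehand).
fuse_p5 = WeightedFusion(num_inputs=4)
p5_td, p5_bu, p5_in, p6_skip = (torch.zeros(1, 64, 32, 32) for _ in range(4))
p5_prime = fuse_p5([p5_td, p5_bu, p5_in, p6_skip])
```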
On the basis of the preliminary feature fusion completed by the depth fusion module, the spatial attention mechanism RSAM and the spatial feature fusion module ASFF are adopted in this embodiment for further spatial feature extraction and fusion. The spatial attention mechanism RSAM performs further spatial feature extraction and fusion on the deeply fused shuffled feature P6' and high-level feature P7'; the spatial feature fusion module ASFF performs further spatial feature extraction and fusion on the deeply fused low-level feature P3' and shuffled features P4'-P5'. Adopting the spatial attention mechanism RSAM and the spatial feature fusion module ASFF for further spatial feature extraction and fusion can enhance target saliency and effectively suppress the interference of negative samples, while having little influence on the network testing speed.
Finally, after the feature maps undergo further spatial feature extraction and fusion by the spatial attention mechanism RSAM and the spatial feature fusion module ASFF, five effective feature layers are obtained, and the feature fusion and anti-interference network FFAN uses the classification and regression network to predict the result. As shown in FIG. 5, the classification regression network includes a classification network and a regression network; the classification network uses three 64-channel convolutions and one n_b × n_c convolution to predict the category of each target frame, and the regression network uses three 64-channel convolutions and one n_b × 4 convolution to predict the regression of each prior frame, where n_b is the number of prior frames owned by the feature layer and n_c is the number of target detection categories of the network.
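A sketch of the prediction heads as described: three 64-channel convolutions followed by one n_b × n_c (classification) or n_b × 4 (regression) convolution. The kernel sizes, the activation, the assumption that the effective feature layers carry 64 channels, and the example values n_b = 9 and n_c = 1 (a single aircraft class) are illustrative assumptions.

```python
import torch
import torch.nn as nn

def make_head(out_channels, mid_channels=64):
    """Three 64-channel 3x3 convolutions followed by one prediction convolution."""
    layers = []
    in_ch = mid_channels                      # assumes 64-channel feature layers
    for _ in range(3):
        layers += [nn.Conv2d(in_ch, mid_channels, 3, padding=1), nn.SiLU(inplace=True)]
        in_ch = mid_channels
    layers.append(nn.Conv2d(mid_channels, out_channels, 3, padding=1))
    return nn.Sequential(*layers)

n_b, n_c = 9, 1                      # e.g. 9 prior boxes per cell, 1 class (aircraft)
cls_head = make_head(n_b * n_c)      # class prediction for each prior box
reg_head = make_head(n_b * 4)        # box regression for each prior box

feat = torch.zeros(1, 64, 64, 64)    # one of the five effective feature layers
cls_out, reg_out = cls_head(feat), reg_head(feat)
print(cls_out.shape, reg_out.shape)  # (1, 9, 64, 64) and (1, 36, 64, 64)
```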
Before the input image is fed into the pre-trained feature fusion and anti-interference network FFAN to obtain the target frame of the target detection, the method also comprises the step of training the feature fusion and anti-interference network FFAN in advance. In this embodiment, the CIoU Loss is improved to obtain the AIoU Loss, which is then combined with the Focal Loss to form the AIF Loss; the loss function L_AIF adopted when training the feature fusion and anti-interference network FFAN is:
L_AIF = L_FL + L_AIoU,
where L_FL is the Focal Loss classification loss and L_AIoU is the AIoU regression loss, with:
L_AIoU = L_CIoU + p·w,
[the equation images in the original publication define the area weight function p and the area similarity w in terms of a, a_t and IoU]
where L_CIoU is the CIoU regression loss, a and a_t respectively denote the areas of the prediction box and the real box, p is a weight function of the areas, w is the similarity of the areas, and IoU is the intersection over union. The CIoU Loss adds an aspect-ratio penalty term on the basis of the DIoU Loss, and this improvement can raise the network regression accuracy to a certain extent. However, when the prediction box and the real box share the same center point and the same aspect ratio, the CIoU Loss still degrades to the IoU Loss, as shown in FIG. 6. Therefore, in this embodiment a penalty term p·w is designed into the loss function L_AIF adopted when training the feature fusion and anti-interference network FFAN; this penalty term takes into account the difference between the areas of the prediction box and the real box, so that the area of the prediction box is driven closer to that of the real box. As shown in FIG. 6, compared with prediction box 1, the area of prediction box 2 is closer to the area of the real box; the L_CIoU of prediction box 2 equals the L_CIoU of prediction box 1, while the L_AIoU of prediction box 1 is greater than the L_AIoU of prediction box 2.
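Because the closed-form expressions for p and w appear only as equation images in the original document, the sketch below implements the standard CIoU regression loss and leaves the area penalty p·w as a caller-supplied hook; it illustrates the composition L_AIoU = L_CIoU + p·w (and, together with a Focal classification loss, L_AIF = L_FL + L_AIoU), not the patent's exact penalty.

```python
import math
import torch

def aiou_loss(pred, target, area_penalty=None, eps=1e-7):
    """CIoU regression loss plus an optional area penalty p*w.

    Boxes are (x1, y1, x2, y2).  `area_penalty`, if given, is a hypothetical
    function of (pred_area, target_area, iou) standing in for the patent's p*w.
    """
    px1, py1, px2, py2 = pred.unbind(-1)
    tx1, ty1, tx2, ty2 = target.unbind(-1)

    # IoU
    iw = (torch.min(px2, tx2) - torch.max(px1, tx1)).clamp(min=0)
    ih = (torch.min(py2, ty2) - torch.max(py1, ty1)).clamp(min=0)
    inter = iw * ih
    area_p = (px2 - px1) * (py2 - py1)
    area_t = (tx2 - tx1) * (ty2 - ty1)
    iou = inter / (area_p + area_t - inter + eps)

    # center distance over enclosing-box diagonal (DIoU term)
    c2 = ((torch.max(px2, tx2) - torch.min(px1, tx1)) ** 2 +
          (torch.max(py2, ty2) - torch.min(py1, ty1)) ** 2) + eps
    rho2 = (((px1 + px2) - (tx1 + tx2)) ** 2 +
            ((py1 + py2) - (ty1 + ty2)) ** 2) / 4.0

    # aspect-ratio consistency (CIoU term)
    v = (4 / math.pi ** 2) * (torch.atan((tx2 - tx1) / (ty2 - ty1 + eps)) -
                              torch.atan((px2 - px1) / (py2 - py1 + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    l_ciou = 1 - iou + rho2 / c2 + alpha * v

    pw = area_penalty(area_p, area_t, iou) if area_penalty is not None else 0.0
    return l_ciou + pw   # L_AIoU = L_CIoU + p*w

pred = torch.tensor([[10., 10., 50., 40.]])
target = torch.tensor([[12., 12., 52., 44.]])
print(aiou_loss(pred, target))   # CIoU part only; supply area_penalty for the full AIoU
```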
Referring to fig. 2, the present embodiment further provides an automatic target detection method for an SAR image, including:
s1) generating a window image from an input SAR image in a sliding window mode with an overlapping area;
s2) taking each window image as an input image, and carrying out target detection by adopting the steps of the SAR image-oriented target automatic detection method to obtain a target frame;
s3) converting the target frame coordinates of each window image into the original SAR image;
and S4) filtering the target frames overlapped in the original SAR image to obtain the target frame serving as a final target detection result.
In this embodiment, when the window images are generated in step S1) in a sliding-window manner with an overlapping area, adjacent windows may overlap by 20%; since the sliding-window detection has a certain overlap rate, overlapping boxes may be generated, so step S4) needs to filter the overlapping target frames in the original SAR image to obtain the target frames serving as the final target detection result. Referring to fig. 2, the method used in step S4) of this embodiment for filtering the overlapping target frames in the original SAR image is the non-maximum suppression (NMS) algorithm; of course, other known filtering methods or algorithms may also be used as needed. During detection, the feature fusion and anti-interference network FFAN generates several predicted target boxes with different confidences for a single target; in this embodiment the NMS algorithm removes the redundant candidate boxes and retains the box that best matches the target, which effectively reduces the number of false alarms.
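A sketch of steps S1)-S4) under stated assumptions: the hypothetical `detector` callable stands in for the trained FFAN and is assumed to return per-window boxes and confidence scores; the 20% overlap, the 512-pixel window and the NMS threshold of 0.5 follow the values mentioned in this embodiment, while the score threshold is an assumption.

```python
import torch
from torchvision.ops import nms

def detect_large_image(sar_image, detector, window=512, overlap=0.2,
                       score_thr=0.5, nms_thr=0.5):
    """Sliding-window detection: S1) window the image with overlap, S2) detect in
    each window, S3) shift boxes back to image coordinates, S4) filter with NMS.
    Border remainders are omitted for brevity."""
    h, w = sar_image.shape[-2:]
    stride = int(window * (1 - overlap))
    all_boxes, all_scores = [], []
    for y0 in range(0, max(h - window, 0) + 1, stride):
        for x0 in range(0, max(w - window, 0) + 1, stride):
            win = sar_image[..., y0:y0 + window, x0:x0 + window]
            boxes, scores = detector(win)                       # S2) detect in the window
            keep = scores > score_thr
            boxes, scores = boxes[keep], scores[keep]
            boxes = boxes + torch.tensor([x0, y0, x0, y0], dtype=boxes.dtype)  # S3)
            all_boxes.append(boxes)
            all_scores.append(scores)
    boxes = torch.cat(all_boxes) if all_boxes else torch.zeros(0, 4)
    scores = torch.cat(all_scores) if all_scores else torch.zeros(0)
    keep = nms(boxes, scores, nms_thr)                          # S4) remove duplicate boxes
    return boxes[keep], scores[keep]
```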
A verification experiment is performed on the feature fusion and anti-interference network FFAN of the method of this embodiment. A dozen scenes of 1 m resolution airport imagery from the Gaofen-3 system were used in the experiment. The data set contains 4396 airplane samples of size 500 × 500 and the corresponding label files, and the ratio of the training set to the validation set is 4:1. In addition, an independent airplane-detection test was performed in this embodiment using SAR images not included in the data set. The experiments in this embodiment were all carried out with the PyTorch framework on Ubuntu 18.04. All networks were trained using the same data set. Each model was trained for 100 epochs on two RTX 2080Ti GPUs with an initial learning rate of 0.0005 and a batch size of 16. The input image size is 512 × 512 pixels. For fairness, the four models EfficientDet, YOLOv4, EWFAN and FFAN in this embodiment do not use a pre-trained model or automatic data enhancement, and do not use any advanced testing techniques such as Softer-NMS or data enhancement in the testing process. The NMS thresholds of these four models were set to 0.5 during the testing phase.
In this embodiment, two evaluation indexes, namely the Detection Rate (DR) and the False Alarm Rate (FAR), are used to measure the effect of the network; the specific calculation formulas are as follows:
DR = C / L × 100%,  FAR = F / S × 100%
The detection rate represents the ratio of the number (C) of airplane targets correctly detected by the network to the number (L) of airplane targets in the label, and the false alarm rate is the ratio of the number (F) of false alarms to the number (S) of prediction frames output by the network.
In order to verify the performance of the proposed algorithm, airport I (Hongqiao Airport, 12000 × 14400) and airport II (Capital Airport, 14400 × 16800) imaged by the Gaofen-3 system were independently tested in this embodiment, and the airplane detection performance was analyzed and evaluated. The aircraft detection results are shown in Table 1.
Table 1: Comparison of the airplane detection results of different networks on SAR images.
[Table 1 appears as an image in the original publication; it lists the detection rate, false alarm rate and test time of EfficientDet, YOLOv4, EWFAN and FFAN on airport I and airport II.]
Airport I and airport II are large civil airports. The airplane targets in these airports are of many types, small in size and densely distributed, and the image background is complex, which increases the detection difficulty. Airport I and airport II contain 127 and 143 airplanes respectively. Analyzing the detection results of the two airports, EfficientDet has the worst detection effect: its false alarm rate is very high, with an average of 40.58%, and its average detection rate is also low. The false alarm rates of YOLOv4 and EWFAN are slightly lower than that of EfficientDet, at 31.95% and 18.20% respectively, but they are still high. FFAN (this embodiment) has the best detection effect, with an average detection rate of 86.26% and an average false alarm rate of 8.42%, which shows that the network can extract features well and improves the ability to resist background interference. In terms of test time, the average test time of FFAN (this embodiment) is 23.87 s, only 0.83 s more than that of EfficientDet; therefore FFAN (this embodiment) greatly improves the overall detection effect while the test speed is basically unaffected, and it has great engineering application value.
In summary, target detection is a hot topic in the field of high-resolution Synthetic Aperture Radar (SAR). Aiming at the complex background and diverse characteristics of airplane targets in SAR images, this embodiment provides an efficient automatic airplane target detection method. Based on the feature fusion and anti-interference network FFAN, the method can extract and fuse airplane features more effectively and obtain satisfactory detection results quickly; its detection rate and false alarm rate are superior to those of existing airplane detection methods such as EfficientDet, EWFAN and YOLOv4. The method can be used for SAR image target detection based on the feature fusion and anti-interference network FFAN and achieves a satisfactory detection effect within a short test time; it can also be used to detect and identify other dense targets in SAR images, and thus has practical application value.
In addition, the present embodiment also provides an automatic target detection system for SAR images, which includes a microprocessor and a memory connected to each other, where the microprocessor is programmed or configured to execute the steps of the aforementioned automatic target detection method for SAR images. Furthermore, the present embodiment also provides a computer-readable storage medium in which a computer program programmed or configured to execute the foregoing SAR image-oriented target automatic detection method is stored.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-readable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above embodiments, and all technical solutions belonging to the idea of the present invention belong to the protection scope of the present invention. It should be noted that modifications and embellishments within the scope of the invention may occur to those skilled in the art without departing from the principle of the invention, and are considered to be within the scope of the invention.

Claims (9)

1. A target automatic detection method facing SAR images is characterized by comprising the steps of inputting input images into a feature fusion and anti-interference network FFAN which is trained in advance to obtain a target frame of target detection, wherein the feature fusion and anti-interference network FFAN comprises the following steps:
a backbone network, which is used for down-sampling the input image into features of different scales, including low-level features P3, middle-level features P4-P6 and high-level features P7;
the pooling and channel shuffling module is used for respectively extracting and fusing the intermediate-level features P4-P6 to obtain the corresponding shuffling features P4'-P6';
the depth fusion module is used for performing depth fusion on the low-level feature P3, the shuffling features P4'-P6' and the high-level feature P7 to respectively obtain the corresponding deeply fused low-level feature P3', shuffling features P4'-P6' and high-level feature P7';
the spatial attention mechanism RSAM is used for further spatial feature extraction and fusion of the deeply fused shuffling feature P6' and high-level feature P7' respectively;
the spatial feature fusion module ASFF is used for further spatial feature extraction and fusion of the deeply fused low-level feature P3' and shuffling features P4'-P5' respectively;
the classification regression network is used for performing classification regression on all spatial features output by the spatial attention mechanism RSAM and the spatial feature fusion module ASFF through extraction and fusion to generate a target frame for target detection;
the pooling and channel shuffling module comprises three shuffling modules, namely a shuffling module PSM-l, a shuffling module PSM-m and a shuffling module PSM-s, wherein the shuffling module PSM-l is used for extracting and fusing the intermediate-level feature P4 to obtain the corresponding shuffling feature P4', the shuffling module PSM-m is used for extracting and fusing the intermediate-level feature P5 to obtain the corresponding shuffling feature P5', and the shuffling module PSM-s is used for extracting and fusing the intermediate-level feature P6 to obtain the corresponding shuffling feature P6', and the shuffling module PSM-l, the shuffling module PSM-m and the shuffling module PSM-s each comprise:
the channel separation module is used for dividing the channel of the input characteristics into two parts;
the maximum pooling operation module is used for performing maximum pooling operation on a part of channels output by the channel separation module;
the average pooling operation module is used for carrying out average pooling operation on the other part of channels output by the channel separation module;
the splicing module is used for splicing the output of the maximum pooling operation module, the output of the average pooling operation module and the original input characteristics of the input channel separation module;
a shuffling module for shuffling the output of the splicing module to enrich the information of each channel;
the convolution module is used for carrying out convolution operation on the output of the shuffling module to enable the number of output characteristic channels to be the same as that of input characteristics;
the sizes of the pooling cores used by the shuffling module PSM-l, the shuffling module PSM-m and the shuffling module PSM-s for performing the maximum pooling operation and the average pooling operation are {9, 15, 25}, {5, 11} and {3} respectively.
2. The SAR image-oriented target automatic detection method according to claim 1, wherein the backbone network is an EfficientNet network; the EfficientNet network successively down-samples the 512 × 512 input image to different sizes to obtain the corresponding features P1 to P7, and the features obtained from the 3rd to 7th down-sampling steps are respectively used as the low-level feature P3, the middle-level features P4 to P6 and the high-level feature P7.
3. The SAR image-oriented target automatic detection method according to claim 1, characterized in that the depth fusion module is an improved structure of BiFPN, and the improved structure adds four jump connections on the basis of BiFPN, wherein the first jump connects the middle-level feature P6 and the shuffle feature P5', the second jump connects the middle-level feature P5 and the shuffle feature P4', the third jump connects the middle-level feature P5 and the middle-level feature P7, and the fourth jump connects the middle-level feature P4 and the middle-level feature P6.
4. The SAR image-oriented target automatic detection method according to claim 1, wherein the classification regression network comprises a classification network and a regression network; the classification network uses three 64-channel convolutions and one n_b × n_c convolution to predict the category of each target frame, and the regression network uses three 64-channel convolutions and one n_b × 4 convolution to predict the regression of each prior frame, wherein n_b is the number of prior frames owned by the feature layer and n_c is the number of target detection categories of the network.
5. The SAR image-oriented target automatic detection method according to claim 1, wherein a step of pre-training the feature fusion and anti-interference network FFAN is further included before inputting the input image into the pre-trained feature fusion and anti-interference network FFAN to obtain the target frame of the target detection, and the loss function L_AIF adopted when training the feature fusion and anti-interference network FFAN is:
L_AIF = L_FL + L_AIoU,
where L_FL is the Focal Loss classification loss and L_AIoU is the AIoU regression loss, with:
L_AIoU = L_CIoU + p·w,
[the equation image in the original publication defines the area weight function p and the area similarity w in terms of a, a_t and IoU]
where L_CIoU is the CIoU regression loss, a and a_t respectively denote the areas of the prediction box and the real box, p is a weight function of the areas, w is the similarity of the areas, and IoU is the intersection over union.
6. An automatic target detection method for SAR images is characterized by comprising the following steps:
s1) generating a window image from an input SAR image in a sliding window mode with an overlapping area;
s2) taking each window image as an input image, and carrying out target detection by adopting the steps of the SAR image-oriented target automatic detection method in any one of claims 1-5 to obtain a target frame;
s3) converting the target frame coordinates of each window image into the original SAR image;
and S4) filtering the target frames overlapped in the original SAR image to obtain the target frame serving as the final target detection result.
7. The SAR image-oriented target automatic detection method as claimed in claim 6, characterized in that the method adopted in step S4) for filtering the target frames overlapped in the original SAR image is a non-maximum suppression method.
8. An automatic target detection system for SAR images, comprising a microprocessor and a memory connected to each other, characterized in that the microprocessor is programmed or configured to execute the steps of the automatic target detection method for SAR images according to any one of claims 1 to 7.
9. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program programmed or configured to execute the SAR image-oriented object automatic detection method according to any one of claims 1 to 7.
CN202110712235.3A 2021-06-25 2021-06-25 SAR image-oriented target automatic detection method and system Expired - Fee Related CN113378763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110712235.3A CN113378763B (en) 2021-06-25 2021-06-25 SAR image-oriented target automatic detection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110712235.3A CN113378763B (en) 2021-06-25 2021-06-25 SAR image-oriented target automatic detection method and system

Publications (2)

Publication Number Publication Date
CN113378763A CN113378763A (en) 2021-09-10
CN113378763B true CN113378763B (en) 2022-11-22

Family

ID=77579205

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110712235.3A Expired - Fee Related CN113378763B (en) 2021-06-25 2021-06-25 SAR image-oriented target automatic detection method and system

Country Status (1)

Country Link
CN (1) CN113378763B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114022752B (en) * 2021-11-04 2024-03-15 中国人民解放军国防科技大学 SAR target detection method based on attention feature refinement and alignment
CN114067126A (en) * 2021-11-19 2022-02-18 长春理工大学 Infrared image target detection method
CN114419490B (en) * 2021-12-28 2024-09-03 北京理工大学 Attention pyramid-based SAR ship target detection method
CN114399504A (en) * 2022-03-25 2022-04-26 济南大学 Intelligent rim detection and tracking method and system based on industrial Internet
CN114861801A (en) * 2022-05-12 2022-08-05 联通(广东)产业互联网有限公司 Automobile part detection method, system, equipment and storage medium based on AI identification
CN115409817B (en) * 2022-08-30 2024-01-26 中南大学 YOLO-based light-weight and efficient detection network
CN115481694B (en) * 2022-09-26 2023-09-05 南京星环智能科技有限公司 Data enhancement method, device and equipment for training sample set and storage medium
CN116958825B (en) * 2023-08-28 2024-03-22 中国公路工程咨询集团有限公司 Mobile remote sensing image acquisition method and highway maintenance monitoring method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11216988B2 (en) * 2017-10-24 2022-01-04 L'oreal System and method for image processing using deep neural networks
CN112149591B (en) * 2020-09-28 2022-09-09 长沙理工大学 SSD-AEFF automatic bridge detection method and system for SAR image
CN112528896B (en) * 2020-12-17 2024-05-31 长沙理工大学 SAR image-oriented automatic aircraft target detection method and system

Also Published As

Publication number Publication date
CN113378763A (en) 2021-09-10

Similar Documents

Publication Publication Date Title
CN113378763B (en) SAR image-oriented target automatic detection method and system
Song et al. Automatic pavement crack detection and classification using multiscale feature attention network
Deng et al. Vision based pixel-level bridge structural damage detection using a link ASPP network
Li et al. Cross-layer attention network for small object detection in remote sensing imagery
Li et al. Sewer pipe defect detection via deep learning with local and global feature fusion
Li et al. Automatic bridge crack identification from concrete surface using ResNeXt with postprocessing
CN104301630B (en) A kind of video image joining method and device
CN109492596B (en) Pedestrian detection method and system based on K-means clustering and regional recommendation network
CN114170532B (en) Multi-target classification method and device based on difficult sample migration learning
CN114519819B (en) Remote sensing image target detection method based on global context awareness
Dai et al. Real-time safety helmet detection system based on improved SSD
Wang et al. Multi-size object detection in large scene remote sensing images under dual attention mechanism
Xiao et al. Region of interest (ROI) extraction and crack detection for UAV-based bridge inspection using point cloud segmentation and 3D-to-2D projection
Zhao et al. High-resolution infrastructure defect detection dataset sourced by unmanned systems and validated with deep learning
CN116861262B (en) Perception model training method and device, electronic equipment and storage medium
Ma et al. Lightweight network for millimeter-level concrete crack detection with dense feature connection and dual attention
CN117727010A (en) Target detection method, image desensitization method, device, medium and vehicle
CN117612044A (en) Method for inspecting transmission line insulator by unmanned aerial vehicle in complex scene
Qu et al. A time sequence location method of long video violence based on improved C3D network
CN117670791A (en) Road disease detection method and device based on multiscale fusion strategy and improved YOLOv5
CN114494893B (en) Remote sensing image feature extraction method based on semantic reuse context feature pyramid
Zhang et al. TB-Net: a three-stream boundary-aware network for fine-grained pavement disease segmentation
Wang Automatic detection of concrete cracks from images using Adam-SqueezeNet deep learning model
CN116310748A (en) Automatic driving scene recovery and automatic driving prototype testing method and system
Wang et al. A deep learning-based method for vehicle licenseplate recognition in natural scene

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20221122