CN114299394A - An Intelligent Interpretation Method of Remote Sensing Image

Info

Publication number: CN114299394A
Application number: CN202111654181.6A
Authority: CN
Inventors: 袁晓军; 周乐乐; 吴帅
Original assignee: Zhuhai Hanchen Technology Co ltd
Current assignee: Zhuhai Hanchen Technology Co ltd
Priority date: 2021-12-30
Filing date: 2021-12-30
Publication date: 2022-04-08

Abstract

The invention belongs to the field of computer vision and relates in particular to an intelligent remote sensing image interpretation method. The invention achieves intelligent interpretation of remote sensing images by combining semantic segmentation, vectorization, and vector simplification and smoothing. With the invention, a remote sensing image of any size can be input and the corresponding vector result is output directly. For a medium-sized district or county image covering about 400 square kilometers at a resolution of 0.8 meters, interpretation time can be kept to about 1 hour, whereas manual interpretation takes about 2 weeks. This greatly improves interpretation efficiency, reduces the cost of manual interpretation, and automates the process. In addition, no comparable automatic interpretation product is currently available on the market, so the invention fills a market gap in intelligent interpretation of remote sensing images.

Description

An Intelligent Interpretation Method of Remote Sensing Image

Technical Field

The invention belongs to the field of computer vision and relates in particular to an intelligent remote sensing image interpretation method.

Background

Identifying and extracting information on different classes of ground features from remote sensing images is a major topic in remote sensing image processing. In remote sensing, extracting ground feature class information from an image is called remote sensing image interpretation. The extracted information includes the class, shape, and area of each feature. Statistical departments need to survey the ground features of the same region every year and compare the results with those of previous years in order to track how the features in the region change, providing data support and decision assistance to government, enterprises, and other bodies. The traditional approach uses professional interpretation software such as ArcGIS, QGIS, or GRASS: a trained interpreter compares against the remote sensing image and manually draws closed vector polygons (Polygon) along feature outlines by placing points and connecting them. Interpreted regions are usually delimited by administrative division, for example by county. For a medium-sized county of about 400 square kilometers imaged at a resolution of 0.8 meters, one professional needs about 2 weeks to complete the drawing, which is very time-consuming and labor-intensive. In addition, because of subjective factors such as individual differences and fatigue, classification criteria may be applied inconsistently over the course of the work, leading to interpretation errors and poor results.

The great progress and success of deep learning, and of convolutional neural networks in particular, in computer vision and image recognition has led many researchers to apply neural network techniques to remote sensing image processing. Technically, remote sensing image segmentation is a semantic segmentation task in computer vision: the pixels of different targets in the image are assigned to different classes, and in the resulting segmentation image the same ground feature class is represented by the same pixel value, while different pixel values represent different classes. Among deep learning semantic segmentation methods, Xia Li, Zhisheng Zhong, Jianlong Wu, Yibo Yang, Zhouchen Lin, and Hong Liu, in "Expectation-Maximization Attention Networks for Semantic Segmentation", introduced an expectation-maximization attention mechanism into convolutional neural networks and proposed EMANet for semantic segmentation. Observing that a spatial pyramid pooling module captures multi-scale information and that an encoder-decoder architecture recovers sharp object boundaries better, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam, in "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation", built on DeepLabv3 by adding a decoder module and applying depthwise separable convolution to the ASPP and decoder modules, proposing the encoder-decoder network DeepLabv3+. Yuhui Yuan, Xilin Chen, and Jingdong Wang, in "Object-Contextual Representations for Semantic Segmentation", used HRNetV2 with an OCR module to reach state-of-the-art performance on the Cityscapes dataset.

Training convolutional neural networks usually requires large amounts of labeled data. Remote sensing image data, however, is usually sensitive and bears on national geographic information security, so only a few authorized government departments, security-cleared enterprises, universities, and research institutions hold such data. Well-labeled public datasets are rare, which creates many difficulties and barriers for research. In recent years some government departments and enterprises have held competitions that release desensitized, declassified datasets to attract more deep learning researchers and advance the field. The core technical difficulty remains the quality of pixel-level classification. Remote sensing images contain many cases of different objects with the same spectrum (the pixel values are the same but the classes differ) and of the same object with different spectra (the same target has different pixel values because of differences in sensor, altitude, illumination, weather, date, and so on). This makes improving segmentation performance very challenging and is a major problem the field urgently needs to solve.

Existing remote sensing image segmentation work concentrates mostly on the segmentation technique itself, that is, on improving pixel-level classification performance, and no mature commercial product has yet appeared.

Summary of the Invention

Based on deep learning semantic segmentation and a proprietary remote sensing image dataset, the invention achieves intelligent interpretation of remote sensing images: given an input remote sensing image, it produces the interpretation result as polygon vectors (Polygon). FIG. 1 shows the overall network structure and processing flow of the invention, and FIG. 2 shows the vector result output by the invention overlaid on the input remote sensing image.

To achieve the above object, the technical solution of the invention is as follows:

An intelligent remote sensing image interpretation method in which, after a remote sensing image is acquired, interpretation comprises the following steps:

S1. Split the original remote sensing image as required to obtain multiple target images, such that adjacent target images overlap.

S2. Classify all target images pixel by pixel using deep learning to generate a grayscale image of the predicted classification result. Specifically:

Each target image is fed into three different semantic segmentation deep learning models, namely DeepLabv3, HRNetW48+OCR, and EMANet. Using ensemble learning, whether a pixel belongs to a given class is decided by a vote among the outputs of the three networks, and the outcome that receives the majority of votes is the final class of that pixel. The semantic segmentation models also output a probability map giving the probability that each pixel belongs to a class, that is, the confidence.

S3. Stitch the predicted classification results of all target images together according to their positions in the original image to obtain the prediction result grayscale image of the original remote sensing image; stitch the probability maps in the same way.

S4. Connect broken long, narrow strip-shaped features in the prediction result grayscale image.

S5. Filter out small patches in the prediction result grayscale image: each isolated patch whose pixel count is below the threshold specified for its class is filled with the class of its surrounding pixels.

S6. Convert the prediction result grayscale image obtained in step S5 into a vector shp file, and compute the confidence and area of each patch as attributes of the patch.

S7. Simplify and smooth the region boundaries in the vector shp file using the Douglas-Peucker algorithm in the open source software GRASS GIS, removing the jagged edges of ground features and thereby obtaining the interpretation result of the remote sensing image.

The beneficial effects of the invention are as follows: the invention achieves intelligent interpretation of remote sensing images. A raster (pixel) remote sensing image is input and the vector interpretation result is output directly, which greatly reduces the difficulty and workload of manual interpretation, greatly increases interpretation speed, and automates interpretation.

The method of the invention links three stages, remote sensing image segmentation, raster vectorization, and vector simplification and smoothing, into an automated intelligent interpretation system that outputs ground feature information in vector form, greatly improving interpretation efficiency and automation. Given a remote sensing image of any size, the intelligent interpretation method of the invention directly outputs the corresponding vector interpretation result.

Description of the Drawings

FIG. 1 shows the processing steps and core processing modules of the intelligent remote sensing image interpretation method of the invention.

FIG. 2 shows the vector result interpreted by the system overlaid on the input remote sensing image.

FIG. 3 is a schematic diagram of the result after raster-to-vector conversion.

FIG. 4 is an enlarged local comparison of the result after vector simplification and smoothing.

Detailed Description

The invention is described in detail below with reference to the accompanying drawings, and some of the processing steps are illustrated.

The invention can automatically interpret remote sensing images under either an eight-class or a twenty-five-class scheme. The classification standards and corresponding numeric codes are given in Table 1 (codes 1-8 are the eight-class standard; the others are the twenty-five-class standard):

Table 1: Classification standards and corresponding numeric codes

FIG. 1 shows the overall framework and processing flow of the intelligent remote sensing image interpretation method of the invention. The flow has four stages. The first stage is segmentation prediction: it crops the remote sensing image, predicts each crop independently, generates the prediction result maps and probability maps, and stitches the sub-image results together. The second stage is post-processing of the grayscale segmentation result and consists of two modules, one that connects broken features and one that filters out patches smaller than the class-specific threshold. The third stage is vectorization: the raster is converted to vectors, and confidence and area attributes are added to each vector patch. The fourth stage is vector simplification and smoothing: the GRASS GIS implementation of the Douglas-Peucker simplification algorithm simplifies and smooths the vectors, removing jagged edges while keeping the topology correct. The stages are described in detail below.

Stage 1: semantic segmentation of the remote sensing image. This stage uses deep learning to classify the remote sensing image pixel by pixel and generate a grayscale classification result in which each pixel value is a class code. In remote sensing image processing this grayscale classification image is also called the raster result. Remote sensing images are usually delimited by administrative unit, for example the image of one county, and are usually large, for example 20000x20000 pixels. Because of limits on computer memory and GPU memory, the original large image cannot be processed in one pass. It is therefore cropped into small 512x512 images (the size can be adjusted to the available computer and GPU memory), each of which is fed to the network for prediction, and the predictions for the small images are then stitched together. The invention allows the small-image size to be set flexibly according to the hardware configuration, which gives the system the ability to process large remote sensing images. The input is the original remote sensing image, for example an image in tif or img format carrying geographic coordinates and spatial reference information.

Adjacent small images must overlap when cropping. If they did not overlap at all, then because each small image is predicted by the network independently, predictions would be inconsistent at image edges. For example, if a building happens to be split across two images, a visible seam appears after stitching. The solution of the invention is to crop with overlap, for example 200 pixels between two adjacent small images, and to average the predicted outputs over the overlapping region, which largely removes stitching seams.
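The overlapped cropping and averaging described above can be sketched as follows. This is a minimal illustration and not the patented implementation: the function names, the `predict_fn` callback (standing in for a segmentation network that returns per-class probabilities for one tile), and the choice to shift the last tile back so it ends at the image border are all assumptions.

```python
import numpy as np

def tile_origins(length, tile=512, overlap=200):
    """Start offsets of tiles that cover [0, length) with the given overlap."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # assumed: last tile is shifted back to end at the border
    return starts

def predict_large_image(image, predict_fn, num_classes, tile=512, overlap=200):
    """Predict tile by tile and average class probabilities where tiles overlap.

    predict_fn(patch) is a hypothetical callback returning an array of shape
    (num_classes, patch_height, patch_width) with per-class probabilities.
    """
    h, w = image.shape[:2]
    th, tw = min(tile, h), min(tile, w)
    prob_sum = np.zeros((num_classes, h, w), dtype=np.float64)
    count = np.zeros((h, w), dtype=np.float64)
    for y in tile_origins(h, th, overlap):
        for x in tile_origins(w, tw, overlap):
            patch = image[y:y + th, x:x + tw]
            prob_sum[:, y:y + th, x:x + tw] += predict_fn(patch)
            count[y:y + th, x:x + tw] += 1  # how many tiles saw each pixel
    probs = prob_sum / count
    # Class index per pixel (0-based; mapping to Table 1 codes is not shown)
    # and the probability of the winning class as a confidence map.
    return probs.argmax(axis=0), probs.max(axis=0)
```

With the example values from the text (512-pixel tiles, 200-pixel overlap), a 20000-pixel image side yields 64 tile positions per axis under this scheme.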

To improve model performance, the core segmentation step uses an ensemble of three models: each small remote sensing image passes through three different models, DeepLabv3, EMANet, and HRNetW48+OCR. The three models predict independently, and a vote is then taken on whether each pixel position belongs to a given class: the position is assigned to the class only when two or more models predict that class, and otherwise it is not. Ensemble learning is a common machine learning technique for improving model performance.
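The two-out-of-three vote can be illustrated with a short NumPy sketch. The function name and the `unassigned` code used when all three models disagree are assumptions; the text does not say how such pixels are resolved, and code 0 is chosen here only because the class codes in Table 1 are described as starting at 1.

```python
import numpy as np

def majority_vote(pred_a, pred_b, pred_c, unassigned=0):
    """Per-pixel vote over three class maps; a class wins when at least two maps agree."""
    a, b, c = (np.asarray(p) for p in (pred_a, pred_b, pred_c))
    result = np.full(a.shape, unassigned, dtype=a.dtype)
    b_and_c_agree = b == c
    a_has_support = (a == b) | (a == c)
    result[b_and_c_agree] = b[b_and_c_agree]
    result[a_has_support] = a[a_has_support]
    return result

# Hypothetical predictions of the three models for four pixels.
votes = majority_vote([1, 2, 3, 4], [1, 5, 6, 4], [7, 5, 8, 4])
```

In this example the first pixel takes class 1 (two votes), the second class 5 (two votes), the third stays unassigned because all three models disagree, and the fourth takes class 4 (three votes).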

In addition, to make it easier for a human to correct the vector results predicted by the model afterwards, the segmentation model also outputs a probability map giving the probability that each pixel belongs to a class, also called the confidence. After vectorization, the confidence of each vector patch is the mean of the probabilities of the pixels under the patch. A human reviewer can then focus on vector patches with low confidence and correct them manually.
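As a rough sketch of the per-patch confidence, the mean probability can be computed from a map of patch identifiers and the probability map. The `patch_ids` array (one non-negative integer per pixel identifying the patch it belongs to) and the 0.6 review cutoff are assumptions made for illustration.

```python
import numpy as np

def patch_confidence(patch_ids, prob_map):
    """Mean pixel probability for each patch id present in patch_ids."""
    ids = np.asarray(patch_ids).ravel()
    probs = np.asarray(prob_map, dtype=np.float64).ravel()
    sums = np.bincount(ids, weights=probs)   # sum of probabilities per id
    counts = np.bincount(ids)                # number of pixels per id
    return {int(i): sums[i] / counts[i] for i in np.nonzero(counts)[0]}

# Two hypothetical patches of two pixels each.
conf = patch_confidence([[1, 1], [2, 2]], [[0.9, 0.7], [0.4, 0.6]])
needs_review = [pid for pid, c in conf.items() if c < 0.6]  # assumed cutoff
```

Here patch 1 averages 0.8 and patch 2 averages 0.5, so only patch 2 would be flagged for manual review under the assumed cutoff.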

Furthermore, the classes in remote sensing images are unevenly distributed. For example, in the eight-class scheme cultivated land accounts for nearly 40%, while garden land and water bodies account for only about 3-4%. As a result, the trained model predicts the under-represented classes poorly. To address this, the invention adopts the Focal Loss function proposed by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollar in "Focal Loss for Dense Object Detection". It also introduces the channel attention mechanism proposed by Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu in "Squeeze-and-Excitation Networks", and the idea of combining spatial and channel attention proposed by Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon in "CBAM: Convolutional Block Attention Module", to further reduce the effect of class imbalance on model performance.
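For reference, the focal loss from the cited paper has the form FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The NumPy sketch below shows a multi-class version; the values of `gamma` and `alpha`, and the plain mean over pixels, are assumptions, since the text does not state the settings actually used.

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0, alpha=None, eps=1e-7):
    """Multi-class focal loss averaged over samples.

    probs:   (N, C) class probabilities, for example softmax outputs per pixel.
    targets: (N,) integer class indices.
    alpha:   optional (C,) per-class weights (assumed, not from the text).
    """
    probs = np.asarray(probs, dtype=np.float64)
    targets = np.asarray(targets)
    # Probability assigned to the true class of each sample.
    p_t = np.clip(probs[np.arange(len(targets)), targets], eps, 1.0)
    weight = (1.0 - p_t) ** gamma  # down-weights samples that are already easy
    if alpha is not None:
        weight = weight * np.asarray(alpha, dtype=np.float64)[targets]
    return float(np.mean(-weight * np.log(p_t)))
```

With `gamma=0` and no `alpha` this reduces to ordinary cross-entropy; with `gamma=2`, a pixel predicted at probability 0.9 contributes only about 1% of its cross-entropy value, so the abundant, easily classified classes dominate the loss less.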

Finally, the predictions for the small images from this stage are stitched into a prediction result grayscale image with the same width and height as the original input remote sensing image.

Stage 2: post-processing of the segmentation result. The model output for the remote sensing image needs two kinds of processing. First, long, narrow strip-shaped features such as roads and rivers are often broken in the prediction and need to be connected. Second, the prediction contains scattered isolated features, and those below a certain area (pixel count) are filtered out; the approach adopted in the invention is to fill each small patch with the class of the pixels around it. The filtering thresholds specified for each class are given in Tables 2 and 3:

Table 2: Filtering thresholds for the eight-class scheme

Table 3: Filtering thresholds for the twenty-five-class scheme

After broken features have been connected and small patches filtered out, processing moves to the next stage.
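The small-patch filtering of this stage can be sketched as a connected-component pass over the class raster. Several details here are assumptions: 4-connectivity, filling with the most frequent bordering class, a single pass over the original labels, and the threshold values in the example (the values of Tables 2 and 3 are not reproduced in the text).

```python
from collections import Counter, deque
import numpy as np

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connectivity (assumed)

def filter_small_patches(labels, min_pixels):
    """Fill connected patches smaller than their class threshold with the
    class that borders them most often. min_pixels maps class code to the
    minimum pixel count; classes missing from it are never filtered."""
    labels = np.asarray(labels)
    out = labels.copy()
    h, w = labels.shape
    seen = np.zeros((h, w), dtype=bool)
    for sy in range(h):
        for sx in range(w):
            if seen[sy, sx]:
                continue
            cls = labels[sy, sx]
            seen[sy, sx] = True
            queue, patch, border = deque([(sy, sx)]), [], Counter()
            while queue:  # breadth-first flood fill of one patch
                y, x = queue.popleft()
                patch.append((y, x))
                for dy, dx in NEIGHBORS:
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < h and 0 <= nx < w):
                        continue
                    if labels[ny, nx] == cls:
                        if not seen[ny, nx]:
                            seen[ny, nx] = True
                            queue.append((ny, nx))
                    else:
                        # Counted once per shared pixel edge.
                        border[int(labels[ny, nx])] += 1
            if border and len(patch) < min_pixels.get(int(cls), 0):
                fill = border.most_common(1)[0][0]
                for y, x in patch:
                    out[y, x] = fill
    return out

# Hypothetical raster: class 1 background, a 1-pixel patch of class 2,
# and a 6-pixel patch of class 3; both thresholds are set to 4 pixels.
grid = np.ones((5, 6), dtype=int)
grid[1, 1] = 2
grid[3:5, 3:6] = 3
cleaned = filter_small_patches(grid, {2: 4, 3: 4})
```

In the example the single class-2 pixel is absorbed into the surrounding class 1, while the six-pixel class-3 patch is kept because it meets the threshold.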

Stage 3: raster vectorization. This stage converts the post-processed raster classification result of stage 2 into vector polygons (Polygon): each connected region of identical pixel values is represented by its outline polygon, that is, the raster is vectorized into polygon features. Because the output of stage 2 is only an image and carries no spatial reference or geographic coordinates, the spatial reference and geographic coordinate information of the original remote sensing image must be attached to the stage 2 result before vectorization. The invention uses the open source library GDAL for vectorization. Vectorization usually generates vector polygons along pixel edges. This produces many jagged edges in the result, but it has the advantage of introducing no topological errors, that is, no overlaps or gaps between polygons, and it is the mainstream, commercially proven approach. During vectorization a confidence attribute is added to each patch; it is computed from the probability map output by the segmentation stage as the mean probability of the pixels under the patch. The area of each vector patch is also computed and stored as the patch's area attribute.
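To illustrate why the georeferencing must be attached before vectorization, the sketch below maps pixel corners to map coordinates with a GDAL-style six-coefficient affine geotransform and computes a patch area with the shoelace formula. It is written in plain Python rather than calling GDAL; the geotransform values and the patch corners are hypothetical, and the area is in square meters only under the assumption of a projected coordinate system in meters.

```python
def pixel_to_geo(gt, col, row):
    """Map a pixel-corner position (col, row) to map coordinates using a
    GDAL-style geotransform (origin x, pixel width, row rotation,
    origin y, column rotation, pixel height)."""
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

def ring_area(ring):
    """Unsigned area of a polygon ring of (x, y) vertices (shoelace formula).
    Coordinates are shifted to the first vertex to limit rounding error."""
    pts = list(ring)
    x0, y0 = pts[0]
    shifted = [(x - x0, y - y0) for x, y in pts]
    total = 0.0
    for (x1, y1), (x2, y2) in zip(shifted, shifted[1:] + shifted[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

# Hypothetical north-up image with 0.8 m pixels.
gt = (500000.0, 0.8, 0.0, 2450000.0, 0.0, -0.8)
# Hypothetical rectangular patch, 10 pixels wide and 5 pixels high.
corners = [(100, 200), (110, 200), (110, 205), (100, 205)]
ring = [pixel_to_geo(gt, col, row) for col, row in corners]
area = ring_area(ring)
```

The 10 x 5 pixel patch covers 8 m x 4 m on the ground at 0.8 m resolution, so the computed area is 32 square meters; without the geotransform only the pixel count would be known.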

Stage 4: vector simplification and smoothing. After vectorization in stage 3 the result contains jagged edges, and unnecessary ones should be removed as far as possible. For example, a straight edge rendered as a staircase should be represented by just its two end vertices, with the intermediate staircase points removed, and an arc rendered as a run of adjacent steps should be represented by a smoother curve. Two kinds of technique are usually involved. Simplification uses an algorithm to delete as many points as possible so that an edge is represented by fewer points, which straightens jagged edges. Smoothing replaces jagged edge lines with smooth curves. The most common problem with both is that simplification or smoothing introduces topological errors, that is, gaps or overlaps between adjacent polygons. Results with topological errors produce incorrect statistics and cannot be used. After extensive investigation, the invention adopts the Douglas-Peucker simplification algorithm provided by the v.generalize module of the open source software GRASS GIS, which removes jagged edges while keeping the topology correct. Figures 3 and 4 show the effect of the algorithm on jagged edges.
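The idea behind Douglas-Peucker simplification can be shown with a standalone sketch for a single open polyline. This is not the GRASS GIS implementation: it simplifies one line in isolation and does nothing to keep shared boundaries between neighbouring polygons consistent, which is the property the text relies on the GRASS GIS module for. The function names and tolerance values are illustrative.

```python
import math

def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Keep the end points; keep an interior point only if it lies farther
    than tolerance from the chord, then recurse on both halves."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right  # drop the duplicated split point

# A staircase edge of the kind produced by pixel-edge vectorization.
stairs = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
straight = douglas_peucker(stairs, 0.8)
# A slightly noisy edge followed by a genuine right-angle corner.
corner = douglas_peucker([(0, 0), (1, 0.05), (2, 0), (2, 1), (2, 2)], 0.1)
```

The staircase collapses to its two end points because every step lies within about 0.71 pixel widths of the diagonal, while the genuine corner in the second example is kept.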

Built on its core artificial intelligence algorithms, the invention links four stages, remote sensing image segmentation, post-processing of the segmentation result, raster vectorization, and vector simplification and smoothing, to achieve intelligent interpretation of large remote sensing images. Afterwards, human intervention is needed only to correct low-confidence predictions, which greatly reduces the effort of manual interpretation and improves efficiency. Experiments showed that manual interpretation of an ordinary medium-sized district or county takes about 2 weeks, whereas the method of the invention produces results in 2 hours, greatly improving interpretation efficiency. As far as is known, no successful product for automatic interpretation of remote sensing images is yet on the market, and the intelligent interpretation system of the invention fills this gap.

Claims

1. An intelligent remote sensing image interpretation method, characterized in that, after a remote sensing image is acquired, the interpretation method comprises the following steps:

S1, splitting the original remote sensing image as required to obtain a plurality of target images, wherein the splitting is performed such that adjacent target images have an overlapping portion;

S2, classifying all the obtained target images pixel by pixel using a deep learning method to generate a prediction classification result grayscale image, specifically:

feeding each target image into three different semantic segmentation deep learning models, namely DeepLabv3, HRNetW48+OCR, and EMANet, and then, using an ensemble learning method, deciding whether any pixel belongs to a given category by a vote among the results output by the three networks, wherein the result receiving the majority of votes is the final category of the pixel; the semantic segmentation deep learning models additionally output a probability map, namely a confidence, of each pixel belonging to a given category;

S3, stitching the prediction classification results of all target images according to their positions in the original image to obtain a prediction result grayscale image of the original remote sensing image, and stitching the probability maps in the same way;

S4, connecting broken long and narrow strip-shaped features in the prediction result grayscale image;

S5, filtering out small patches in the prediction result grayscale image, specifically filling each isolated patch whose number of pixels is smaller than a category-specific threshold with the category of its surrounding pixels;

S6, converting the prediction result grayscale image obtained in step S5 into a vector shp file, and calculating the confidence and the area of each patch as attributes of the patch;

S7, simplifying and smoothing the region boundaries of the vector shp file using the Douglas-Peucker algorithm in the open source software GRASS GIS to eliminate the jagged-edge effect on ground features, thereby obtaining the interpretation result of the remote sensing image.