CN114299394A - Intelligent interpretation method for remote sensing image - Google Patents

Info

Publication number
CN114299394A
Authority
CN
China
Prior art keywords
remote sensing
interpretation
image
sensing image
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111654181.6A
Other languages
Chinese (zh)
Inventor
袁晓军
周乐乐
吴帅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Hanchen Technology Co ltd
Original Assignee
Zhuhai Hanchen Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Hanchen Technology Co ltd filed Critical Zhuhai Hanchen Technology Co ltd
Priority to CN202111654181.6A priority Critical patent/CN114299394A/en
Publication of CN114299394A publication Critical patent/CN114299394A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention belongs to the field of computer vision and relates in particular to an intelligent interpretation method for remote sensing images. The method combines semantic segmentation, raster-to-vector conversion, and vector simplification and smoothing to interpret remote sensing images automatically. It accepts remote sensing images of any size as input and directly outputs the corresponding vector result. For a medium-sized county of about 400 square kilometers imaged at a resolution of 0.8 meters, interpretation takes about 1 hour, whereas manual interpretation takes about 2 weeks; the method therefore greatly improves interpretation efficiency, reduces the cost of manual interpretation, and automates the process. Moreover, no similar automatic interpretation product currently exists on the market, so the invention fills a market gap in intelligent remote sensing image interpretation.

Description

Intelligent interpretation method for remote sensing image
Technical Field
The invention belongs to the field of computer vision, and particularly relates to an intelligent remote sensing image interpretation method.
Background
Identifying and extracting information about different types of ground objects from remote sensing images is a major topic in remote sensing image processing. In remote sensing, extracting ground-object type information from an image is called remote sensing image interpretation. The extracted information includes the type, shape, and area of each ground object. Statistical departments need to survey the ground objects of the same region every year and compare the results with those of previous years in order to track how the ground objects in the region develop and change, and to provide data support and decision aids for governments, enterprises, and other bodies. Traditionally, a trained interpreter uses professional software such as ArcGIS, QGIS, or GRASS GIS and, working from the remote sensing image, manually draws closed vector polygons (Polygon) along the contours of ground objects by connecting points one by one. Interpreted areas are usually divided by administrative district, for example by county. One professional needs about 2 weeks to draw a single medium-sized county with an area of about 400 square kilometers at a remote sensing resolution of 0.8 meters, which is very time-consuming and labor-intensive. In addition, subjective factors such as individual differences and fatigue make drawing precision and classification standards inconsistent over time, which causes interpretation errors and degrades the result.
Deep learning, and convolutional neural networks in particular, has made great progress in computer vision image processing and recognition, which has attracted many researchers to apply neural networks to remote sensing image processing. Technically, remote sensing image segmentation is a semantic segmentation task in computer vision: the pixels of different targets in a remote sensing image are assigned to different categories, and in the resulting segmentation image each ground-object category is represented by one pixel value, with different pixel values representing different categories. In deep learning semantic segmentation, Xia Li, Zhisheng Zhong, Jialong Wu, Yibo Yang, Zhouchen Lin, and Hong Liu proposed EMANet in "Expectation-Maximization Attention Networks for Semantic Segmentation". Observing that a spatial pyramid pooling module captures multi-scale information while an encoder-decoder architecture captures sharp object edges better, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam, in "Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation", built on DeepLabv3 by adding a decoder module and applying depthwise separable convolution to the ASPP and decoder modules, and proposed DeepLabv3+, which is based on an encoder-decoder network architecture. Yuhui Yuan, Xilin Chen, and Jingdong Wang, in "Object-Contextual Representations for Semantic Segmentation", achieved state-of-the-art performance on the Cityscapes dataset with HRNetV2 plus an OCR module.
Training a convolutional neural network usually requires a large amount of labeled data. Remote sensing imagery is usually sensitive and bears on national geographic information security, so only a few government departments and authorized enterprises, colleges, and universities hold such data, and well-labeled public datasets are scarce, which creates many difficulties and barriers for research. In recent years, government departments and enterprises have held competitions that provide desensitized and declassified datasets, attracting more deep learning researchers and advancing the field. The core technical difficulty still lies in pixel-level classification performance. Remote sensing images contain many cases of different objects with the same spectrum (the pixel values are the same but the categories differ) and of the same object with different spectra (the pixel values of the same target differ with imaging equipment, altitude, illumination, weather, date, and so on). Both pose great challenges for improving remote sensing image segmentation and remain a major difficulty in the field.
Most existing methods focus on the segmentation technique itself, that is, on improving pixel-level classification performance, and no mature commercial product has yet emerged.
Disclosure of Invention
The method builds on deep learning semantic segmentation and uses its own remote sensing image dataset to achieve intelligent interpretation of remote sensing images. A remote sensing image is input, and a polygon vector (Polygon) interpretation result is output. Fig. 1 shows the overall network structure and processing flow of the invention, and Fig. 2 shows the vector result output by the invention overlaid on the input remote sensing image.
To achieve this purpose, the technical solution of the invention is as follows:
An intelligent interpretation method for remote sensing images comprises the following steps after a remote sensing image is acquired:
S1, cutting the original remote sensing image as required to obtain a plurality of target images, wherein adjacent target images are made to overlap;
S2, classifying all the target images pixel by pixel with a deep learning method to generate a grayscale image of predicted classes; specifically:
feeding each target image into three different semantic segmentation deep learning models, namely DeepLabv3, HRNetW48+OCR, and EMANet, then using ensemble learning to vote on the category of each pixel from the outputs of the three networks, the category that receives the most votes being the final category of the pixel; each semantic segmentation model also outputs a probability map, that is, the confidence that each pixel belongs to a given category;
S3, stitching the predicted classification results of all target images according to their positions in the original image to obtain a prediction grayscale image of the original remote sensing image, and stitching the probability maps in the same way;
S4, connecting broken long, narrow strip-shaped elements in the prediction grayscale image;
S5, filtering small patches in the prediction grayscale image, specifically by filling each isolated patch whose pixel count is below the threshold specified for its category with the category of the surrounding pixels;
S6, converting the prediction grayscale image obtained in step S5 into a vector shp file, and computing the confidence and area of each patch as attributes of that patch;
S7, simplifying and smoothing the region boundaries in the vector shp file with the Douglas-Peucker algorithm in the open-source software GRASS GIS to eliminate the jagged edges of ground objects, thereby obtaining the interpretation result of the remote sensing image.
The beneficial effects of the invention are as follows: the invention interprets remote sensing images intelligently, taking a raster (pixel) remote sensing image as input and directly outputting a vector interpretation result, which greatly reduces the difficulty and workload of manual interpretation, greatly increases interpretation speed, and automates interpretation.
The method links three steps, namely remote sensing image segmentation, raster vectorization, and vector simplification and smoothing, into an automatic intelligent interpretation system that outputs ground-object information in vector form and greatly improves interpretation efficiency and automation. A remote sensing image of any size can be input, and the method directly outputs the corresponding vector interpretation result.
Drawings
FIG. 1 shows the processing flow and core processing modules of the intelligent interpretation method for remote sensing images.
FIG. 2 shows the vector result interpreted by the system overlaid on the input remote sensing image.
FIG. 3 shows the result after the raster is converted into vectors.
FIG. 4 shows an enlarged comparison of part of the result after vector simplification and smoothing.
Detailed Description
The invention is described in detail below with reference to the drawings, and the results of certain steps in the flow are shown.
The invention supports automatic interpretation of remote sensing images under an eight-class scheme and a twenty-five-class scheme. The class division standard and corresponding numerical codes are shown in Table 1 (codes 1-8 denote the eight-class standard; the remaining codes denote the twenty-five-class standard):
TABLE 1 Class division standard and corresponding numerical codes
[Table 1 appears only as an image in the source: Figure BDA0003445361930000041]
FIG. 1 shows the general framework and processing flow of the intelligent interpretation method for remote sensing images. The overall process has four stages. The first stage is remote sensing image segmentation and prediction: the image is cut into tiles, each tile is predicted independently to produce a prediction map and a probability map, and the tile results are stitched and fused. The second stage is post-processing of the grayscale segmentation result. It contains two modules: one connects broken elements, and the other filters small patches that fall below a class-specific threshold. The third stage is vectorization: the raster is converted to vectors, and confidence and area attributes are added to each vector patch. The fourth stage is vector simplification and smoothing: the Douglas-Peucker simplification algorithm in GRASS GIS simplifies and smooths the vectors, eliminating jagged edges while preserving a correct topology. The stages are as follows:
The first stage is semantic segmentation of the remote sensing image. In this stage a deep learning algorithm classifies the remote sensing image pixel by pixel to generate a grayscale classification image whose pixel values are category codes. In remote sensing image processing, this grayscale map of pixel classification results is also called a raster result map. Remote sensing images are generally divided by administrative unit; the image of a county, for example, is usually large, on the order of 20000x20000 pixels. Because computer memory and graphics card memory cannot hold the original large image at once, the image is cut into small 512x512 tiles (the exact size can be adjusted to the available computer and graphics card memory), each tile is sent to the network for prediction, and the tile predictions are then stitched together. The tile size can be set flexibly according to the hardware configuration, which gives the system the ability to process large remote sensing images. The input is the original remote sensing image, for example a file with a .tif or .img suffix that carries geographic coordinates and spatial reference information.
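As a rough sanity check on the sizes quoted above, the following sketch estimates the side length in pixels of a county-scale raster and the number of tiles a fixed tile size implies. The 400 square kilometer area, 0.8 m resolution, and 512-pixel tile come from the text; the function names, the assumption of a square extent, and the use of non-overlapping tiles are illustrative only.

```python
import math

def raster_side_pixels(area_km2, resolution_m):
    """Side length in pixels of a square raster covering area_km2 (square extent assumed)."""
    side_m = math.sqrt(area_km2 * 1_000_000)
    return round(side_m / resolution_m)

def tiles_per_axis(size_px, tile_px):
    """Number of non-overlapping tiles needed to cover one axis (the last tile may be partial)."""
    return -(-size_px // tile_px)  # integer ceiling division

side = raster_side_pixels(400, 0.8)
per_axis = tiles_per_axis(side, 512)
print(side, per_axis, per_axis * per_axis)
```

The tile count grows with the square of the image side, which is why the tile size is worth tuning to the available graphics card memory.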
When the image is cut, adjacent tiles must overlap. Each tile is predicted by the network independently, so without any overlap the predictions would be inconsistent at tile edges. For example, if a building is split across two tiles, an obvious seam appears when the tiles are stitched. The solution of the invention is to overlap adjacent tiles during cutting, for example by 200 pixels, and then average the predicted outputs over the overlapping area, which greatly reduces stitching seams.
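The overlap-and-average idea can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: `tile_starts` and `blend_tiles` are hypothetical names, the choice to shift the last tile back so that it ends at the image border is an assumption, and nested lists stand in for the arrays a real system would use.

```python
def tile_starts(size, tile, overlap):
    """Start offsets of tiles along one axis; the last tile is shifted back to end at the border."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than the tile size")
    if size <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)
    return starts

def blend_tiles(height, width, tiles):
    """Average per-pixel probabilities wherever tiles overlap.

    tiles: iterable of (row0, col0, probs), where probs is a 2-D list of floats
    and (row0, col0) is the tile's top-left corner in the full image.
    """
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for row0, col0, probs in tiles:
        for r, line in enumerate(probs):
            for c, p in enumerate(line):
                total[row0 + r][col0 + c] += p
                count[row0 + r][col0 + c] += 1
    return [[t / n if n else 0.0 for t, n in zip(t_row, n_row)]
            for t_row, n_row in zip(total, count)]

# Two 1x4 tiles over a 1x6 strip, overlapping by 2 pixels.
starts = tile_starts(6, 4, 2)
mosaic = blend_tiles(1, 6, [(0, starts[0], [[1.0] * 4]), (0, starts[1], [[0.0] * 4])])
print(starts, mosaic)
```

Averaging probabilities before taking the class decision, rather than averaging class codes, keeps the blend meaningful, since class codes are labels and not quantities.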
At the core of the segmentation step, the invention integrates three models to improve performance: every tile passes through three different models, DeepLabv3, EMANet, and HRNetW48+OCR. The three models predict independently and then vote on the category at each pixel position: when two or more models predict that the pixel belongs to a category, the pixel is assigned that category; otherwise it is not. Ensemble learning is a common technique for improving model performance in machine learning.
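The two-out-of-three vote can be sketched as below. The source does not say what happens when all three models disagree, so the `FALLBACK` code of 0 is purely an assumption for illustration, as are the function names.

```python
from collections import Counter

FALLBACK = 0  # hypothetical "undecided" code; the source does not define the three-way-tie case

def vote_pixel(labels, fallback=FALLBACK):
    """Return the class predicted by at least two models, else the fallback."""
    label, votes = Counter(labels).most_common(1)[0]
    return label if votes >= 2 else fallback

def vote_maps(*label_maps, fallback=FALLBACK):
    """Apply the per-pixel vote to equally sized 2-D label maps, one per model."""
    return [[vote_pixel(pixels, fallback) for pixels in zip(*rows)]
            for rows in zip(*label_maps)]

model_a = [[1, 2], [3, 4]]
model_b = [[1, 5], [3, 6]]
model_c = [[7, 2], [8, 9]]
print(vote_maps(model_a, model_b, model_c))
```

One alternative to a fixed fallback would be to take the class with the highest averaged probability across the three models; which rule the invention uses is not stated.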
To make it easier to correct the predicted vector result manually afterwards, the segmentation model also outputs a probability map, also called the confidence, giving the probability that each pixel belongs to a given category. After vectorization, the confidence of each vector patch is the mean probability of the pixels under that patch. During manual review, attention can be focused on the patches with low confidence.
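A minimal sketch of the patch confidence described above: the mean pixel probability per patch, followed by a filter for patches worth reviewing. The `patch_ids` input (a map assigning each pixel to a patch), the function names, and the 0.6 review threshold are all assumptions made for illustration; the source gives no threshold.

```python
def patch_confidence(patch_ids, probs):
    """Mean probability of the pixels belonging to each patch.

    patch_ids: 2-D list of patch identifiers (hypothetical output of a labeling step).
    probs:     2-D list of per-pixel probabilities for the predicted class.
    """
    sums, counts = {}, {}
    for id_row, p_row in zip(patch_ids, probs):
        for pid, p in zip(id_row, p_row):
            sums[pid] = sums.get(pid, 0.0) + p
            counts[pid] = counts.get(pid, 0) + 1
    return {pid: sums[pid] / counts[pid] for pid in sums}

def needs_review(confidences, threshold=0.6):
    """Patch ids whose confidence is below the review threshold (0.6 is an arbitrary example)."""
    return sorted(pid for pid, conf in confidences.items() if conf < threshold)

conf = patch_confidence([[1, 1], [2, 2]], [[0.5, 1.0], [0.25, 0.25]])
print(conf, needs_review(conf))
```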
In addition, the categories in remote sensing images are unevenly distributed. In the eight-class scheme, for example, cultivated land accounts for nearly 40%, while garden land and water areas account for only about 3-4%, so categories with a small share are predicted poorly after training. To address this, the invention adopts the Focal Loss function proposed in "Focal Loss for Dense Object Detection" by Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár, and introduces the channel attention mechanism proposed in "Squeeze-and-Excitation Networks" by Jie Hu, Li Shen, Samuel Albanie, Gang Sun, and Enhua Wu, together with the idea of combining spatial attention with channel attention proposed in "CBAM: Convolutional Block Attention Module" by Sanghyun Woo, Jongchan Park, Joon-Young Lee, and In So Kweon, to further reduce the impact of class imbalance on model performance.
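To illustrate why Focal Loss helps with imbalanced classes, the sketch below evaluates the per-pixel form FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), where p_t is the predicted probability of the true class. The defaults gamma = 2 and alpha = 0.25 are the values commonly quoted from the Focal Loss paper; the source does not state which values the invention uses, so they are assumptions here.

```python
import math

def focal_loss(p_true, gamma=2.0, alpha=0.25):
    """Focal loss for one pixel, given the predicted probability of its true class.

    gamma and alpha are assumed defaults, not values taken from the patent.
    """
    p = min(max(p_true, 1e-7), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)

def cross_entropy(p_true):
    """Plain cross-entropy for comparison; equals focal_loss with gamma=0, alpha=1."""
    return focal_loss(p_true, gamma=0.0, alpha=1.0)

for p in (0.1, 0.5, 0.9):
    print(p, cross_entropy(p), focal_loss(p, alpha=1.0))
```

With gamma = 2, a pixel already classified with probability 0.9 contributes only about one hundredth of its cross-entropy loss, so abundant, easy classes stop dominating the gradient and the rarer classes receive relatively more weight.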
Finally, the predictions of all tiles from this stage are stitched into a prediction grayscale image with the same width and height as the original input remote sensing image.
The second stage is post-processing of the segmentation result. The model output for the remote sensing image needs two kinds of processing. First, long strip-shaped elements such as roads and rivers are often broken in the prediction and need to be reconnected. Second, the prediction contains many scattered isolated elements, and those at or below a specified area (number of pixels) are filtered out. The invention fills each such small patch with the category of the surrounding pixels. The filtering thresholds for each class are shown in Table 2 and Table 3:
TABLE 2 Filtering thresholds for the eight-class scheme
[Table 2 appears only as an image in the source: Figure BDA0003445361930000061]
TABLE 3 Filtering thresholds for the twenty-five-class scheme
[Table 3 appears only as an image in the source: Figure BDA0003445361930000071]
After the broken elements are connected and the small patches are filtered, processing moves to the next stage.
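The small-patch filter of this stage can be sketched as a connected-component pass. This is an illustrative reading of the description, not the invention's code: 4-connectivity, the rule of filling with the most frequent bordering class, and the threshold values in the example are assumptions, since the actual thresholds are given only in the table images.

```python
from collections import Counter, deque

def filter_small_patches(grid, min_pixels):
    """Fill connected patches smaller than their class threshold with the dominant bordering class.

    grid:       2-D list of class codes.
    min_pixels: dict mapping class code -> minimum patch size in pixels (hypothetical values).
    """
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    seen = [[False] * w for _ in range(h)]
    for r0 in range(h):
        for c0 in range(w):
            if seen[r0][c0]:
                continue
            cls = grid[r0][c0]
            patch, border = [], Counter()
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:  # breadth-first search over the 4-connected patch
                r, c = queue.popleft()
                patch.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < h and 0 <= nc < w:
                        if grid[nr][nc] != cls:
                            border[grid[nr][nc]] += 1
                        elif not seen[nr][nc]:
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if border and len(patch) < min_pixels.get(cls, 0):
                fill = border.most_common(1)[0][0]
                for r, c in patch:
                    out[r][c] = fill
    return out

grid = [[1, 1, 1],
        [1, 2, 1],
        [1, 1, 1]]
print(filter_small_patches(grid, {2: 4}))
```

Patches are detected on the input grid and written to a copy, so the result does not depend on the order in which patches are visited.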
The third stage is raster vectorization, in which the raster classification result from the second-stage post-processing is converted into vector polygons (Polygon): each connected region of identical pixel values is represented by the polygon outlining it, that is, as a polygon feature. Because the output of the second stage is a plain image that carries no spatial reference or geographic coordinates, the spatial reference and geographic coordinate information of the original remote sensing image must be attached to it before vectorization. The invention performs vectorization with the open-source library GDAL. Vectorization typically generates vector polygons along pixel edges. This introduces many jagged edges into the result, but it has the advantage of producing no topological errors, that is, no overlaps or gaps between polygons, and it is the mainstream approach in commercial vectorization. During vectorization a confidence attribute is added to each patch; it is derived from the probability map output by the segmentation and equals the mean probability of the pixels under the patch. The area of each vector patch is also computed and stored as its area attribute.
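Attaching geographic coordinates to the post-processed raster amounts to reusing the affine geotransform of the original image. The sketch below applies a six-term geotransform in the GDAL ordering to convert pixel positions to map coordinates and derives a patch area from its pixel count. The example transform values (origin, 0.8 m pixels, north-up) are hypothetical, and the area calculation assumes a projected coordinate system in meters.

```python
def pixel_to_geo(gt, col, row):
    """Map a pixel corner (col, row) to map coordinates with a six-term affine geotransform.

    gt follows the GDAL ordering:
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height).
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

def patch_area_m2(pixel_count, gt):
    """Area of a patch from its pixel count, assuming map units are meters."""
    pixel_area = abs(gt[1] * gt[5] - gt[2] * gt[4])
    return pixel_count * pixel_area

# Hypothetical north-up transform with 0.8 m pixels.
gt = (500000.0, 0.8, 0.0, 2500000.0, 0.0, -0.8)
print(pixel_to_geo(gt, 10, 5), patch_area_m2(100, gt))
```

Because polygons are traced along pixel edges, their vertices are pixel corners, so this mapping is all that is needed to place them in the coordinate system of the original image.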
The fourth stage is vector simplification and smoothing. The vectorized result of the third stage has jagged edges, and unnecessary jagged vertices should be removed as far as possible. For example, a straight edge rendered as a staircase should be represented by the two end vertices of the line, with the unnecessary intermediate vertices removed, and an arc rendered as a run of adjacent steps should be represented by a relatively smooth arc. Two kinds of technique apply here: simplification and smoothing. Simplification deletes as many points as possible so that an edge is represented by fewer points, which straightens jagged edges. Smoothing replaces jagged edge lines with smooth curves. The most common problem with both is that simplification or smoothing can introduce topological errors, that is, gaps or overlaps between adjacent polygons. Topological errors lead to statistical errors and make the result unusable. After extensive investigation, the invention uses the Douglas-Peucker simplification algorithm provided by the v.generalize module of the open-source software GRASS GIS, which eliminates the jagged edges while keeping the topology correct. Fig. 3 and Fig. 4 illustrate the effect of the algorithm on jagged edges.
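For readers unfamiliar with the Douglas-Peucker algorithm, here is a textbook recursive version in plain Python. It is not the GRASS GIS v.generalize code, and on its own it simplifies a single polyline without any guarantee about the topology between neighboring polygons; the function names and the tolerance in the example are illustrative.

```python
import math

def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, tolerance):
    """Simplify a polyline, dropping vertices that lie within tolerance of the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    # Keep the farthest vertex and simplify both halves recursively.
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right

staircase = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2), (3, 2), (3, 3)]
print(douglas_peucker(staircase, 1.0))
```

A pixel-edge staircase along a diagonal deviates from the chord by at most about 0.71 pixel widths, so a tolerance of one pixel collapses it to its two endpoints while real corners, which deviate much more, survive. The recursion depth grows with the number of retained vertices, so very long boundaries may call for an iterative variant.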
With an artificial intelligence algorithm at its core, the method connects four stages, namely remote sensing image segmentation, segmentation post-processing, raster vectorization, and vector simplification and smoothing, to interpret large remote sensing images intelligently. Manual intervention is needed only afterwards, to correct predictions with low confidence, which greatly reduces the intensity of manual interpretation and improves efficiency. Experiments show that manually interpreting a typical medium-sized county takes about 2 weeks, whereas the method produces results within 2 hours, a large gain in interpretation efficiency. To the applicant's knowledge, no successful automatic remote sensing image interpretation product exists on the market, and this intelligent interpretation system fills that gap.

Claims (1)

1. An intelligent interpretation method for remote sensing images, characterized in that, after a remote sensing image is acquired, the method comprises the following steps:
S1, cutting the original remote sensing image as required to obtain a plurality of target images, wherein adjacent target images are made to overlap;
S2, classifying all the target images pixel by pixel with a deep learning method to generate a grayscale image of predicted classes; specifically:
feeding each target image into three different semantic segmentation deep learning models, namely DeepLabv3, HRNetW48+OCR, and EMANet, then using ensemble learning to vote on the category of each pixel from the outputs of the three networks, the category that receives the most votes being the final category of the pixel; each semantic segmentation model also outputs a probability map, that is, the confidence that each pixel belongs to a given category;
S3, stitching the predicted classification results of all target images according to their positions in the original image to obtain a prediction grayscale image of the original remote sensing image, and stitching the probability maps in the same way;
S4, connecting broken long, narrow strip-shaped elements in the prediction grayscale image;
S5, filtering small patches in the prediction grayscale image, specifically by filling each isolated patch whose pixel count is below the threshold specified for its category with the category of the surrounding pixels;
S6, converting the prediction grayscale image obtained in step S5 into a vector shp file, and computing the confidence and area of each patch as attributes of that patch;
S7, simplifying and smoothing the region boundaries in the vector shp file with the Douglas-Peucker algorithm in the open-source software GRASS GIS to eliminate the jagged edges of ground objects, thereby obtaining the interpretation result of the remote sensing image.
CN202111654181.6A 2021-12-30 2021-12-30 Intelligent interpretation method for remote sensing image Pending CN114299394A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111654181.6A CN114299394A (en) 2021-12-30 2021-12-30 Intelligent interpretation method for remote sensing image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111654181.6A CN114299394A (en) 2021-12-30 2021-12-30 Intelligent interpretation method for remote sensing image

Publications (1)

Publication Number Publication Date
CN114299394A true CN114299394A (en) 2022-04-08

Family

ID=80973266

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111654181.6A Pending CN114299394A (en) 2021-12-30 2021-12-30 Intelligent interpretation method for remote sensing image

Country Status (1)

Country Link
CN (1) CN114299394A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115578607A (en) * 2022-12-08 2023-01-06 自然资源部第三航测遥感院 Method for rapidly extracting coverage area of effective pixels of remote sensing image
CN117349462A (en) * 2023-12-06 2024-01-05 自然资源陕西省卫星应用技术中心 Remote sensing intelligent interpretation sample data set generation method
CN117349462B (en) * 2023-12-06 2024-03-12 自然资源陕西省卫星应用技术中心 Remote sensing intelligent interpretation sample data set generation method
CN118093940A (en) * 2024-01-03 2024-05-28 河北省第一测绘院 Remote sensing image interpretation method, device, server and storage medium
CN118093940B (en) * 2024-01-03 2024-11-05 河北省第一测绘院 Remote sensing image interpretation method, device, server and storage medium

Similar Documents

Publication Publication Date Title
CN111986099B (en) Tillage monitoring method and system based on convolutional neural network with residual error correction fused
CN111914795B (en) Method for detecting rotating target in aerial image
CN110598690B (en) End-to-end optical character detection and recognition method and system
Wang et al. Photovoltaic panel extraction from very high-resolution aerial imagery using region–line primitive association analysis and template matching
CN102592268B (en) Method for segmenting foreground image
CN103049763B (en) Context-constraint-based target identification method
CN114299394A (en) Intelligent interpretation method for remote sensing image
CN102722712A (en) Multiple-scale high-resolution image object detection method based on continuity
CN114241326B (en) Progressive intelligent production method and system for ground feature elements of remote sensing images
CN112990086A (en) Remote sensing image building detection method and device and computer readable storage medium
Jiao et al. A survey of road feature extraction methods from raster maps
CN110781882A (en) License plate positioning and identifying method based on YOLO model
CN111462140B (en) Real-time image instance segmentation method based on block stitching
CN111027538A (en) Container detection method based on instance segmentation model
CN111402224A (en) Target identification method for power equipment
CN109635726A (en) A kind of landslide identification method based on the symmetrical multiple dimensioned pond of depth network integration
CN115661072A (en) Disc rake surface defect detection method based on improved fast RCNN algorithm
CN111461006A (en) Optical remote sensing image tower position detection method based on deep migration learning
CN114511627A (en) Target fruit positioning and dividing method and system
CN116778137A (en) Character wheel type water meter reading identification method and device based on deep learning
CN118172334A (en) Electric network wiring diagram electric element cascade detection method based on transducer and residual convolution
CN113012167B (en) Combined segmentation method for cell nucleus and cytoplasm
CN115019163A (en) City factor identification method based on multi-source big data
CN104463091A (en) Face image recognition method based on LGBP feature subvectors of image
CN116052110B (en) Intelligent positioning method and system for pavement marking defects

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination