
CN113052121B - Multi-level network map intelligent generation method based on remote sensing image


Info

Publication number
CN113052121B
CN113052121B (application CN202110377329.XA)
Authority
CN
China
Prior art keywords
remote sensing
network map
map
sensing image
level
Prior art date
Legal status
Active
Application number
CN202110377329.XA
Other languages
Chinese (zh)
Other versions
CN113052121A (en)
Inventor
付莹
方政
Current Assignee
Beijing Institute of Technology BIT
Original Assignee
Beijing Institute of Technology BIT
Priority date
Filing date
Publication date
Application filed by Beijing Institute of Technology BIT
Priority to CN202110377329.XA
Publication of CN113052121A
Application granted
Publication of CN113052121B
Legal status: Active
Anticipated expiration

Classifications

    • G06V 20/13 — Satellite images (scenes; terrestrial scenes)
    • G06F 18/23 — Pattern recognition; clustering techniques
    • G06F 18/241 — Pattern recognition; classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N 3/045 — Neural networks; combinations of networks
    • G06N 3/08 — Neural networks; learning methods
    • G09B 29/003 — Maps


Abstract

The invention discloses an intelligent multi-level network map generation method based on remote sensing images, belonging to the technical field of computer vision. The method uses a preliminary generation algorithm model to generate a preliminary network map; this model expands the level number into a normalized, image-sized identifier to help it learn the drawing characteristics of maps at different levels, so that it can accurately generate network maps with the drawing characteristics proper to each level from remote sensing images of different levels with similar content, yielding a multi-level network map with detailed and reasonable inter-level differences. A map improvement algorithm model then generates refined network maps; this model uses the higher-level refined network maps to assist the generation of lower-level maps, so that it learns the consistency between maps of different levels and guarantees consistency between the network maps of corresponding regions at different levels. With this method, network map generation is completed automatically from input aerial or satellite remote sensing images, without manually collecting ground vector data.

Description

Multi-level network map intelligent generation method based on remote sensing image
Technical Field
The invention relates to an intelligent generation method of a multi-level network map, in particular to an intelligent generation method of a multi-level network map based on a remote sensing image, and belongs to the technical field of computer vision.
Background
Remote sensing is a non-contact, long-range detection technology; the films or photographs it produces, which record the electromagnetic radiation of various ground objects, are called remote sensing images. Remote sensing images can be collected by high-altitude platforms such as unmanned aerial vehicles, aircraft and satellites; collection is fast to update, relatively cheap, unaffected by ground conditions, and therefore highly adaptable. A remote sensing image contains a large amount of ground electromagnetic information and reflects the distribution of ground objects such as roads, water areas and buildings, from which the information required for map drawing can be extracted.
The multi-level network map is a network map adopting a multi-resolution level model; it is flexible to use and convenient to transmit, and is adopted by more and more network map services. A multi-level network map generally uses the tile-map pyramid model: from the bottom layer (level k) to the top layer (level 0) of the tile pyramid the resolution decreases, while the represented geographic range stays unchanged. Specifically, the ground distance per pixel of the level-k tile map is half that of level k-1, so a deeper level has a larger scale and can display finer content. In a multi-level network map, the geographic elements contained in the maps of different levels are consistent while their levels of detail differ, so the map exhibits both consistency and difference across levels.
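As an illustration of this pyramid arithmetic, the Python sketch below computes the tiles per axis and the ground distance per pixel at each level. The level-0 base resolution is the common Web Mercator figure and is an assumption for illustration only, not a value taken from the invention.

```python
# Illustrative tile-pyramid arithmetic: each level down doubles the tile
# count per axis and halves the ground distance per pixel, while the
# covered geographic range stays fixed.

BASE_METERS_PER_PIXEL = 156543.03  # assumed level-0 resolution at the equator

def meters_per_pixel(k: int) -> float:
    """Ground distance represented by one pixel at pyramid level k."""
    return BASE_METERS_PER_PIXEL / (2 ** k)

def tiles_per_axis(k: int) -> int:
    """Number of tiles along one axis at level k."""
    return 2 ** k

if __name__ == "__main__":
    for k in range(5):
        print(f"level {k}: {meters_per_pixel(k):12.2f} m/px, "
              f"{tiles_per_axis(k):3d} tiles per axis")
```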
In a traditional multi-level network map, the map images of all levels are usually rendered from map vector data according to certain drawing standards, and the vector data are usually collected manually in the field, which greatly limits efficiency and raises cost. Since remote sensing images are fast to acquire and cheap to collect, automatically generating the network map from remote sensing images becomes a feasible solution. Existing methods typically treat this task either as image semantic segmentation or as image conversion. Image semantic segmentation classifies each pixel of an image by the object class it belongs to; with such techniques, the pixels of a remote sensing image can be classified by ground object class and marked with different colors to form a network map. Image conversion generally converts an image of one style into an image of another style while preserving its structural information; with such techniques, a remote-sensing-style image can be converted into a network-map-style image.
However, conventional methods consider only the generation of a single-level map. If such a method is applied level by level, it cannot capture the consistency and difference between maps of different levels, and it is therefore difficult to generate a multi-level network map whose information is expressed accurately and consistently and whose visual effect is good.
Disclosure of Invention
The invention aims to provide a multi-level network map generation method based on remote sensing images. It responds to the requirements of existing network map services for multi-level network maps and addresses two technical problems: the traditional multi-level network map production process is costly, inefficient and difficult to update in real time in emergencies, and existing intelligent network map generation methods consider only single-level maps and cannot capture the consistency and difference between maps of different levels. The method is highly adaptable, fast in generation, accurate in its results, good in visual effect, and maintains good consistency and difference across levels.
The innovation points of the invention are as follows: the method comprises a training phase and a use phase. In the training phase, the pixel color values of the network maps in the remote sensing image-network map paired training data set are first clustered, and the ground object class mask corresponding to each network map is derived. Then, a preliminary generation algorithm model is trained with a mixture of the remote sensing images of all levels, the corresponding level information, the corresponding real ground object class masks and the corresponding real network maps. Next, all remote sensing images in the training set and their level numbers are input into the trained preliminary generation algorithm model, and the preliminary network map of every level is generated and stored for later use. Finally, a map improvement algorithm model is trained with the remote sensing images, preliminary network maps, real ground object class masks and real network maps, level by level from high to low.
In the use phase, if the acquired remote sensing images form a single level, they are first expanded into multiple levels. The remote sensing images of each level and the corresponding level information are then input in turn into the trained preliminary generation algorithm model, and the corresponding preliminary network maps are generated and stored. Next, based on the remote sensing images and preliminary network maps of each level, the trained map improvement algorithm model generates the refined network map of each level in turn from high to low. Finally, the generated refined network maps of all levels are stitched into a multi-level network map according to their tile numbers.
The technical scheme adopted by the invention is as follows:
a multi-level network map generation method based on remote sensing images comprises the following steps:
step 1: and (5) a training stage.
Specifically, the method comprises the following steps:
step 1.1: and clustering the pixel color values of the network map in the training data set of the remote sensing image-network map pair to obtain a ground feature type mask corresponding to the network map.
The specific method comprises the following steps:
firstly, clustering all pixel points of real network map data in a training set by using a clustering algorithm, solving a class number corresponding to each pixel, and corresponding each class number to an expressed ground feature semantic class.
And then, restoring the semantic categories corresponding to the pixels according to the original positions of the pixels in the network map, generating real ground object category masks corresponding to the real network map one by one and storing the masks.
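A minimal sketch of this step is given below, assuming scikit-learn's KMeans as the clustering algorithm and a known class count n_classes; the invention requires only that some clustering algorithm be used, so both choices are assumptions.

```python
# Sketch of step 1.1: cluster the pixel color values of a real network-map
# tile, then restore each pixel's cluster index at its original position to
# obtain the ground object class mask.
import numpy as np
from sklearn.cluster import KMeans

def network_map_to_mask(map_rgb: np.ndarray, n_classes: int) -> np.ndarray:
    """map_rgb: (H, W, 3) uint8 network map -> (H, W) integer class mask."""
    h, w, _ = map_rgb.shape
    pixels = map_rgb.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(pixels)
    # Each cluster number is then mapped to the ground-object semantic class
    # it expresses (road, water, building, ...) by inspecting cluster colors.
    return labels.reshape(h, w)
```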
Step 1.2: and training a preliminary generation algorithm model by using the remote sensing images of all levels, corresponding level information, corresponding real ground object class masks and corresponding real network maps in a mixed mode.
The specific method comprises the following steps:
Randomly select a remote sensing image from the training data set, normalize its level number by dividing it by the total level number K, and input the remote sensing image together with the normalized level number into the preliminary generation algorithm model. The model outputs a prediction of the ground object class mask and a prediction of the network map.
The predicted ground object class mask has the same size as the input remote sensing image; the solution space of each pixel is the set of integers [0, n-1], each integer representing one ground object class, where n is the total number of ground object classes. The predicted network map is a picture in RGB format with the same size as the input remote sensing image. The predicted ground object class mask and the predicted network map output by the model are compared with the real ground object class mask and the real network map respectively, the loss function is calculated, the loss value is back-propagated, and the parameters of the preliminary generation algorithm model are updated. This process is repeated until the set number of iterations is reached, and the network structure and model parameters are saved, yielding the trained preliminary generation algorithm model structure and parameters.
The preliminary generated algorithm model includes two modules: the map drawing system comprises a first semantic extraction module and a first map drawing module.
When the remote sensing image is input into a preliminary generation algorithm model, the remote sensing image firstly passes through a first semantic extraction module which is a full convolution network. The first semantic extraction module can perform optimization by using a cross entropy loss function, wherein the minimized cross entropy loss function formula is as follows:
$$\min_{\theta}\;\mathcal{L}(\theta) = -\sum_{i=1}^{C} s_i \log F_{\theta}(x)_i \qquad (1)$$

where θ is the model parameter of the first semantic extraction module F, whose output is the segmentation result together with the feature map preceding it; x ∈ R^{N×H×W} is the input remote sensing image, with N, H and W respectively the number of image channels, the height and the width; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H and W respectively its number of channels, height and width; s_i is the ground truth of the i-th class of interest targets, a value of 1 marking a pixel that belongs to that class and 0 one that does not; F_θ(x)_i is the confidence with which the first semantic extraction module predicts the i-th class of interest targets.
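In PyTorch terms (the framework is our assumption; the invention prescribes none), the objective of equation (1) corresponds to the standard per-pixel cross entropy:

```python
# Per-pixel cross entropy between predicted class scores and the real
# ground object class mask; the shapes and class count n=6 are illustrative.
import torch
import torch.nn.functional as F

logits = torch.randn(1, 6, 256, 256)         # F_theta(x): one score per class
target = torch.randint(0, 6, (1, 256, 256))  # real class mask, values 0..5
loss = F.cross_entropy(logits, target)       # mean of -log softmax over pixels
print(loss.item())
```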
In addition, the model can use different loss functions, such as the Focal loss or the Lovász loss, according to its specific details and the training data set.
Then, the first map drawing module simultaneously receives the output information of the first semantic extraction module (the mask and the feature map), the original remote sensing image and the corresponding level number, and generates a preliminary network map in RGB format. The level number is normalized by dividing it by the total level number K and is then expanded into a 1×H×W tensor filled with that value, where H and W are respectively the height and width of the input remote sensing image; this tensor represents the level information.
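A sketch of this level-information encoding, again assuming PyTorch; the concrete sizes are illustrative:

```python
# Expand the normalized level number k/K into a 1 x H x W plane that can be
# concatenated with the remote sensing image and the feature channels.
import torch

def level_plane(k: int, K: int, H: int, W: int) -> torch.Tensor:
    return torch.full((1, H, W), k / K)

# Example: a 3-channel 256x256 tile at level 14 of an 18-level pyramid.
x = torch.rand(3, 256, 256)
x_cond = torch.cat([x, level_plane(14, 18, 256, 256)], dim=0)  # (4, 256, 256)
```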
The first map drawing module is a conditional generative adversarial network that performs supervised learning with the ground-truth results of the target domain. It comprises a generator and a discriminator that are trained adversarially: the generator synthesizes data according to the given conditions, and the discriminator distinguishes the generated data from real data. The generator tries to produce data as close to real as possible, and the discriminator accordingly tries to distinguish real data from generated data perfectly. In this process the discriminator takes the role of a loss function learned from the image data and guides the generator. The module may use the following basic loss function:
$$\min_{\phi}\max_{\psi}\;\mathbb{E}_{y\sim p_{data}(y)}\left[\log D_{\psi}(y)\right]+\mathbb{E}_{x\sim p_{data}(x)}\left[\log\left(1-D_{\psi}\left(G_{\phi}(x,F_{\theta}(x),k)\right)\right)\right] \qquad (2)$$

where φ and ψ are respectively the parameters of the generator G and the discriminator D; x, y ∈ R^{C×H×W}, x being the remote sensing image and y the real network map, with C, H and W respectively the number of image channels, the height and the width; p_data(x) and p_data(y) are the data distributions of the remote sensing images and of the real network maps; k is the zoom level number of the remote sensing image, expanded into level information after being input into the model; F_θ(x) is the mask and feature map output by the first semantic extraction module; E denotes the mathematical expectation.
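The adversarial objective of equation (2) is typically optimized by alternating generator and discriminator updates. The sketch below assumes hypothetical PyTorch modules G and D (with D ending in a sigmoid) whose concrete architectures the invention leaves open; feats stands for F_θ(x) and level for the expanded level information:

```python
# One adversarial step for equation (2): the discriminator pushes real maps
# toward 1 and generated maps toward 0, while the generator tries to fool it.
# G and D are hypothetical nn.Modules; D is assumed to output probabilities.
import torch
import torch.nn.functional as F

def gan_losses(G, D, x, feats, level, y_real):
    y_fake = G(x, feats, level)                  # G_phi(x, F_theta(x), k)
    d_real, d_fake = D(y_real), D(y_fake.detach())
    d_loss = (F.binary_cross_entropy(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake)))
    d_gen = D(y_fake)                            # no detach: gradients reach G
    g_loss = F.binary_cross_entropy(d_gen, torch.ones_like(d_gen))
    return d_loss, g_loss
```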
In addition, different loss functions, such as a reconstruction loss, a feature matching loss, a perceptual loss or a multi-scale discriminator loss, can be chosen according to the specific details of the model and the training data set.
Step 1.3: and inputting all the remote sensing images and the corresponding level numbers in the training set into the trained preliminary generation algorithm model respectively, generating a preliminary network map of each level, and storing for later use.
The specific method comprises the following steps:
and (3) establishing a preliminary generation algorithm model according to the preliminary generation algorithm model structure and parameters stored in the step (1.2), inputting the remote sensing image and the corresponding hierarchy information into the model, and storing a preliminary network map output by the model. The preliminary network map generation formula is shown as follows:
$$y' = G_{\phi}(x, F_{\theta}(x), k) \qquad (3)$$

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image (expanded into level information after being input into the model), F_θ(x) is the mask and feature map output by the first semantic extraction module, G denotes the generator and φ the generator parameters.
Step 1.4: train the map improvement algorithm model with the remote sensing images, preliminary network maps, real ground object class masks and real network maps of each level, in turn from high to low.
The specific method comprises the following steps:
for a multi-level network map training data set containing K levels, the data levels are all integers in {0,1, …, K-1} numbered. First, the preliminary network map of the K-1 layer is used as the refined network map of the layer. And then, taking K to be K-1, randomly selecting a K-1 layer remote sensing image, a K-1 layer preliminary network map corresponding to the K-1 layer remote sensing image and 4 refined network maps corresponding to the K layer remote sensing image to input into a map improvement algorithm model, generating a K-1 layer ground object type mask prediction result and a K-1 layer network map prediction result, comparing the K-1 layer ground object type mask prediction result with a real ground object type mask and a real network map respectively, calculating a loss function and updating parameters in the map improvement algorithm model according to the loss function.
The previous step is repeated until the set number of iterations is reached, and the current map improvement algorithm model is then used to generate and store the corresponding level k-1 refined network map for all level k-1 remote sensing images. Then k takes, from large to small, all integers in {1, 2, …, K-2}; after the above training process is repeated for each value of k, the training of the map improvement algorithm model is complete.
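The coarse-to-fine schedule of step 1.4 can be summarized by the sketch below; train_step, refine_level and preliminary_maps are hypothetical callables standing in for the loss computation, the batch inference and the stored data described above:

```python
# Coarse-to-fine schedule of step 1.4: the level K-1 preliminary maps seed
# the refined maps, then k walks down from K-1 to 1, training on level k-1
# with the level-k refined maps as guidance.
def train_map_improvement(train_step, refine_level, preliminary_maps,
                          K: int, iters: int):
    refined = {K - 1: preliminary_maps(K - 1)}   # deepest level used as-is
    for k in range(K - 1, 0, -1):                # k = K-1, K-2, ..., 1
        for _ in range(iters):
            train_step(level=k - 1, guidance=refined[k])
        refined[k - 1] = refine_level(level=k - 1, guidance=refined[k])
    return refined
```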
The map improvement algorithm model comprises a second semantic extraction module and a second map drawing module.
When a level k-1 remote sensing image is input into the map improvement algorithm model, it first passes through the second semantic extraction module, which is a full convolution network. The second semantic extraction module may be optimized with a cross entropy loss function, minimizing:
$$\min_{\theta'}\;\mathcal{L}(\theta') = -\sum_{i=1}^{C} s_i \log F'_{\theta'}(x)_i \qquad (4)$$

where θ′ is the model parameter of the second semantic extraction module F′, whose output is the segmentation result together with the feature map preceding it; x ∈ R^{N×H×W} is the input remote sensing image, with N, H and W respectively the number of image channels, the height and the width; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H and W respectively its number of channels, height and width; s_i is the ground truth of the i-th class of interest targets, a value of 1 marking a pixel that belongs to that class and 0 one that does not; F′_θ′(x)_i is the confidence with which the second semantic extraction module predicts the i-th class of interest targets.
In addition, the model can use different loss functions, such as the Focal loss or the Lovász loss, according to its specific details and the training data set.
Then, the second map drawing module simultaneously receives the output information of the second semantic extraction module (the mask and the feature map), the level k-1 preliminary network map corresponding to the remote sensing image, and the 4 refined network maps of the corresponding level-k region, and generates the refined network map corresponding to the remote sensing image.
The second map drawing module is a conditional generative adversarial network that performs supervised learning with the ground-truth results of the target domain. It comprises a generator and a discriminator that are trained adversarially: the generator synthesizes data according to the given conditions, and the discriminator distinguishes the generated data from real data. The generator tries to produce data as close to real as possible, and the discriminator accordingly tries to distinguish real data from generated data perfectly. In this process the discriminator takes the role of a loss function learned from the image data and guides the generator. Through this mutual game between generator and discriminator, the generator finally produces generated data that meet the quality requirements. The basic loss function used by this module is:
$$\min_{\phi'}\max_{\psi'}\;\mathbb{E}_{y_{k-1}\sim p_{data}(y)}\left[\log D_{\psi'}(y_{k-1})\right]+\mathbb{E}\left[\log\left(1-D_{\psi'}\left(G_{\phi'}(x_{k-1},F'_{\theta'}(x_{k-1}),y'_{k-1},\hat{y}_k^{\downarrow})\right)\right)\right] \qquad (5)$$

where φ′ and ψ′ are respectively the parameters of the generator G and the discriminator D; x, y, y′ ∈ R^{C×H×W}, x being the remote sensing image, y the real network map and y′ the preliminary network map, with C, H and W respectively the number of image channels, the height and the width; the subscripts k-1 and k denote the zoom level of the map; p_data(x), p_data(y) and p_data(y′) are the data distributions of the remote sensing images, the real network maps and the preliminary network maps; ŷ_k^↓ denotes the level-k refined network map stitched and downsampled into an image of the same size as the remote sensing image; y′_{k-1}, y_{k-1} and x_{k-1} represent the same actual geographic area in position and size; F′_θ′(x_{k-1}) is the mask and feature map output by the second semantic extraction module; E denotes the mathematical expectation.
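The operator written ŷ_k^↓ above can be realized as in the following sketch; bilinear interpolation is our assumption, the invention requiring only that the level-k tiles be stitched and downsampled:

```python
# Stitch the four level-k refined tiles covering one level-(k-1) tile into a
# 2x2 mosaic and resize it back to single-tile size (float tensors assumed).
import torch
import torch.nn.functional as F

def stitch_and_downsample(tl, tr, bl, br):
    """Inputs: four (C, H, W) tiles; output: one (C, H, W) guidance image."""
    top = torch.cat([tl, tr], dim=2)          # (C, H, 2W)
    bottom = torch.cat([bl, br], dim=2)       # (C, H, 2W)
    mosaic = torch.cat([top, bottom], dim=1)  # (C, 2H, 2W)
    return F.interpolate(mosaic.unsqueeze(0), scale_factor=0.5,
                         mode="bilinear", align_corners=False).squeeze(0)
```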
In addition, different loss functions, such as a reconstruction loss, a feature matching loss, a perceptual loss or a multi-scale discriminator loss, can be chosen according to the specific details of the model and the training data set.
And 2, step: and (4) a use stage.
Specifically, the method comprises the following steps:
step 2.1: if the collected remote sensing image is a single-level image, the remote sensing image is expanded into a multi-level image.
The specific method comprises the following steps:
and (3) regarding the collected single-layer remote sensing image as a kth layer, numbering all tiles, splicing every two adjacent 2 x 2 remote sensing image tiles, down-sampling by methods such as interpolation and the like until the size of each tile is the same as that of the original single tile, and processing all remote sensing image tiles on the kth layer to obtain the kth-1 layer remote sensing image.
Repeat the above step to iteratively generate the images of levels k-2, k-3 and so on, until the number of images in a level is small enough (for example, at most 20 tiles) or the level has only one row or one column of tiles. After the remote sensing images of every level have been generated, take the lowest-resolution level as level 0 and renumber the levels.
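A sketch of this expansion under stated assumptions: tiles are numpy arrays keyed by (row, col), the tile grid at each level is even, and 2×2 mean pooling stands in for downsampling by interpolation:

```python
# Step 2.1 sketch: iteratively merge 2x2 tile groups and downsample until a
# level is small enough or has only one row or column of tiles.
import numpy as np

def merge4(tl, tr, bl, br):
    """Stitch four (H, W, 3) tiles and pool back to (H, W, 3)."""
    top = np.concatenate([tl, tr], axis=1)
    bottom = np.concatenate([bl, br], axis=1)
    big = np.concatenate([top, bottom], axis=0).astype(np.float32)
    h, w = big.shape[0] // 2, big.shape[1] // 2
    return big.reshape(h, 2, w, 2, 3).mean(axis=(1, 3))

def expand_to_pyramid(tiles, max_top_tiles: int = 20):
    levels = [tiles]                          # levels[0] = the collected level
    while len(levels[-1]) > max_top_tiles:
        rows = {r for r, _ in levels[-1]}
        cols = {c for _, c in levels[-1]}
        if len(rows) == 1 or len(cols) == 1:  # single row/column: stop
            break
        cur, coarser = levels[-1], {}
        for (r, c) in {(r // 2, c // 2) for (r, c) in cur}:
            coarser[(r, c)] = merge4(cur[(2*r, 2*c)], cur[(2*r, 2*c + 1)],
                                     cur[(2*r + 1, 2*c)], cur[(2*r + 1, 2*c + 1)])
        levels.append(coarser)
    return list(reversed(levels))             # index 0 = the new level 0
```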
Step 2.2: and sequentially inputting the remote sensing images of all levels and the corresponding level information into a trained preliminary generation algorithm model, and generating and storing a corresponding preliminary network map.
The specific method comprises the following steps:
and creating a network model according to the preliminarily generated algorithm model structure and parameters stored in the training stage, inputting the remote sensing image and the hierarchical information into the model, predicting the model through a first semantic extraction module and a first language map drawing module respectively, and automatically storing a preliminary network map finally generated by the first language map drawing module, wherein the network map is in an RGB image format, and the size of the preliminary network map is consistent with that of the input remote sensing image tile.
The preliminary network map generation formula is as follows:
$$y' = G_{\phi}(x, F_{\theta}(x), k) \qquad (6)$$

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image (expanded into level information after being input into the model), F_θ(x) is the mask and feature map output by the first semantic extraction module, G denotes the generator module and φ the generator module parameters.
Step 2.3: and based on the remote sensing images of all levels and the preliminary network map, sequentially generating the fine network map of each level from high to low by using a trained map improvement algorithm model.
For a multi-level remote sensing image data set containing K levels, the levels are numbered with all integers in {0, 1, …, K-1}. First, the preliminary network map of level K-1 is used as the refined network map of that level. Then, create the network model from the map improvement algorithm model structure and parameters saved in the training phase; with k = K-1, input the level k-1 remote sensing images, the level k-1 preliminary network maps and the level-k refined network maps into the model, and generate and store the corresponding level k-1 refined network maps. Take k, from large to small, over all integers in {1, 2, …, K-2} and repeat the above process for each value of k, completing the generation of the refined network maps of all levels.
The generation formula of the refined network map is as follows:
$$\hat{y}_{k-1} = G_{\phi'}\left(x_{k-1}, F'_{\theta'}(x_{k-1}), y'_{k-1}, \hat{y}_k^{\downarrow}\right) \qquad (7)$$

where φ′ is the parameter of the generator module G; x, y, y′ ∈ R^{C×H×W}, x being the remote sensing image, y the real network map and y′ the preliminary network map, with C, H and W respectively the number of image channels, the height and the width; the subscripts k and k-1 denote the zoom level of the map; ŷ_{k-1} is the level k-1 refined network map produced by the generator; ŷ_k^↓ denotes the level-k refined network map stitched and downsampled into an image of the same size as the remote sensing image; y′_{k-1} and x_{k-1} represent the same actual area in position and size; F′_θ′(x_{k-1}) is the mask and feature map output by the second semantic extraction module.
Step 2.4: after the network map of every level has been generated tile by tile, stitch the generated network maps according to their sequence numbers to obtain the complete multi-level network map.
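A sketch of this final stitching, using Pillow as an assumed image library and 256-pixel square tiles keyed by (row, col):

```python
# Step 2.4 sketch: paste each generated tile into one mosaic per level
# according to its (row, col) number. The 256 px tile size is an assumption.
from PIL import Image

def mosaic_level(tiles, tile_px: int = 256) -> Image.Image:
    rows = 1 + max(r for r, _ in tiles)
    cols = 1 + max(c for _, c in tiles)
    canvas = Image.new("RGB", (cols * tile_px, rows * tile_px))
    for (r, c), tile in tiles.items():
        canvas.paste(tile, (c * tile_px, r * tile_px))
    return canvas
```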
The method can generate the multi-level network map on any scale.
Advantageous effects
Compared with the prior art, the method of the invention has the following advantages:
1. The method uses a preliminary generation algorithm model to generate the preliminary network map. The model expands the level information into a normalized, image-sized identifier that helps it learn the drawing characteristics of maps at different levels, so that it can accurately generate network maps with the drawing characteristics proper to each level from remote sensing images of different levels with similar content, yielding a multi-level network map with detailed and reasonable inter-level differences.
2. The method uses a map improvement algorithm model to generate the refined network map, and the model uses a high-level refined network map to assist the generation of a low-level map, so that the model learns the consistency among maps of different levels, and the consistency among regional network maps corresponding to different levels is ensured.
3. The method can reduce the cost of network map generation. Compared with traditional multi-level network map production, it needs no manually collected ground vector data; using only aerial or satellite remote sensing images, the generation of the network map is completed automatically from the input images.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a schematic diagram of multi-level network image generation by the method of the present invention.
FIG. 3 is a schematic diagram of the internal details of the multi-level network image generation by the core algorithm model according to the method of the present invention.
Detailed Description
For a better understanding of the objects and advantages of the present invention, reference should be made to the following detailed description taken in conjunction with the accompanying drawings and examples.
Examples
A multi-level network map generation method based on remote sensing images comprises two stages: training and use.
Step 1: and (5) a training stage.
In the training stage, the pixel color values of the network maps in the remote sensing image-network map paired training data set are clustered, and the ground object class mask corresponding to each network map is derived; a preliminary generation algorithm model is trained with a mixture of the remote sensing images of all levels, the corresponding level information, the corresponding real ground object class masks and the corresponding real network maps; all remote sensing images in the training set and their level numbers are input into the trained preliminary generation algorithm model, and the preliminary network map of every level is generated and stored for later use; and the map improvement algorithm model is trained with the remote sensing images, preliminary network maps, real ground object class masks and real network maps of each level, in turn from high to low.
Specifically, step 1 comprises the steps of:
step 1.1: and clustering the pixel color values of the network map in the training data set of the remote sensing image-network map pair to obtain a ground feature type mask corresponding to the network map.
The specific method comprises the following steps:
clustering all pixel points of real network map data in the training set by using a clustering algorithm, solving a class number corresponding to each pixel, and corresponding each class number to an expressed ground feature semantic class; and then, restoring the semantic categories corresponding to the pixels according to the original positions of the pixels in the network map, generating real ground object category masks corresponding to the real network map one by one and storing the masks.
Step 1.2: and training a preliminary generation algorithm model by using the remote sensing images of all levels, corresponding level information, corresponding real ground object class masks and corresponding real network maps in a mixed mode.
The specific method comprises the following steps:
Randomly select a remote sensing image from the training data set, normalize its level number by dividing it by K (K is the total level number), and input the remote sensing image together with the normalized level information into the preliminary generation algorithm model. The model outputs a prediction of the ground object class mask and a prediction of the network map. The predicted ground object class mask has the same size as the input remote sensing image; the solution space of each pixel is the set of integers [0, n-1], each integer representing one ground object class, where n is the total number of ground object classes. The predicted network map is a picture in RGB format with the same size as the input remote sensing image. The predicted ground object class mask and the predicted network map output by the model are compared with the real ground object class mask and the real network map respectively, the loss function is calculated, the loss value is back-propagated, and the parameters of the preliminary generation algorithm model are updated. This process is repeated until the set number of iterations is reached, and the network structure and model parameters are saved, yielding the trained preliminary generation algorithm model structure and parameters.
The preliminary generated algorithm model includes two modules: the map drawing system comprises a first semantic extraction module and a first map drawing module.
When the remote sensing image is input into a preliminary generation algorithm model, the remote sensing image firstly passes through a first semantic extraction module which is a full convolution network. The first semantic extraction module can perform optimization by using a cross entropy loss function, wherein the minimized cross entropy loss function formula is as follows:
$$\min_{\theta}\;\mathcal{L}(\theta) = -\sum_{i=1}^{C} s_i \log F_{\theta}(x)_i \qquad (1)$$

where θ is the model parameter of the first semantic extraction module F, whose output is the segmentation result together with the feature map preceding it; x ∈ R^{N×H×W} is the input remote sensing image, with N, H and W respectively the number of image channels, the height and the width; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H and W respectively its number of channels, height and width; s_i is the ground truth of the i-th class of interest targets, a value of 1 marking a pixel that belongs to that class and 0 one that does not; F_θ(x)_i is the confidence with which the first semantic extraction module predicts the i-th class of interest targets.
In addition, the model can use different loss functions, such as the Focal loss or the Lovász loss, according to its specific details and the training data set.
Then, the first map drawing module simultaneously receives the output information of the first semantic extraction module (the mask and the feature map), the original remote sensing image and the corresponding level number, and generates a preliminary network map in RGB format. The remote sensing image level number is normalized by dividing it by K (K is the total level number) and is then expanded into a 1×H×W tensor filled with that value, where H and W are respectively the height and width of the input remote sensing image; this tensor represents the level information.
The first map drawing module is a conditional generative adversarial network that performs supervised learning with the ground-truth results of the target domain. It comprises a generator and a discriminator that are trained adversarially: the generator synthesizes data according to the given conditions, and the discriminator distinguishes the generated data from real data. The generator tries to produce data as close to real as possible, and the discriminator accordingly tries to distinguish real data from generated data perfectly. In this process the discriminator takes the role of a loss function learned from the image data and guides the generator. The basic loss function used by this module is:
$$\min_{\phi}\max_{\psi}\;\mathbb{E}_{y\sim p_{data}(y)}\left[\log D_{\psi}(y)\right]+\mathbb{E}_{x\sim p_{data}(x)}\left[\log\left(1-D_{\psi}\left(G_{\phi}(x,F_{\theta}(x),k)\right)\right)\right] \qquad (2)$$

where φ and ψ are respectively the parameters of the generator G and the discriminator D; x, y ∈ R^{C×H×W}, x being the remote sensing image and y the real network map, with C, H and W respectively the number of image channels, the height and the width; p_data(x) and p_data(y) are the data distributions of the remote sensing images and of the real network maps; k is the zoom level number of the remote sensing image, expanded into level information after being input into the model; F_θ(x) is the mask and feature map output by the first semantic extraction module.
In addition, different loss functions, such as a reconstruction loss, a feature matching loss, a perceptual loss or a multi-scale discriminator loss, can be chosen according to the specific details of the model and the training data set.
Step 1.3: and respectively inputting all remote sensing images and corresponding hierarchy numbers in the training set into the trained preliminary generation algorithm model, and generating a preliminary network map of each hierarchy for storage and standby.
The specific method comprises the following steps:
and (3) creating a preliminary generation algorithm model according to the preliminary generation algorithm model structure and parameters stored in the step (1.2), inputting the remote sensing image and the corresponding hierarchy information into the model, and storing a preliminary network map output by the model. The preliminary network map generation formula is shown as follows:
$$y' = G_{\phi}(x, F_{\theta}(x), k) \qquad (3)$$

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image (expanded into level information after being input into the model), F_θ(x) is the mask and feature map output by the first semantic extraction module, G denotes the generator and φ the generator parameters.
Step 1.4: and sequentially utilizing the remote sensing images of all levels from high to low, the preliminary network map, the real ground object class mask and the real network map to train the map improvement algorithm model.
The specific method comprises the following steps:
for a multi-level network map training data set containing K levels, the data levels are all integers in {0,1, …, K-1} numbered. First, the preliminary network map of the K-1 layer is used as the refined network map of the layer. Then, taking K as K-1, randomly selecting a K-1 layer remote sensing image, a K-1 layer preliminary network map corresponding to the K-1 layer remote sensing image and 4 pieces of fine-finished network maps corresponding to the K layer remote sensing image in an input map improvement algorithm model, generating a K-1 layer ground object type mask prediction result and a K-1 layer network map prediction result, respectively comparing the K-1 layer ground object type mask prediction result with a real ground object type mask and the real network map, calculating a loss function and updating parameters in the map improvement algorithm model according to the loss function; repeating the previous step until the set iteration times are met, and generating and storing a corresponding k-1 layer refined network map for all remote sensing images of the k-1 layer by using the current map improvement algorithm model; then, all integers with K being {1,2, …, K-2} are taken from high to low in sequence, and the training of the map improvement algorithm model is completed after the training process is repeated for each K value.
The map improvement algorithm model includes two modules: the second semantic extraction module and the second map drawing module.
When the remote sensing image of the k-1 th layer is input into the map improvement algorithm model, the remote sensing image firstly passes through a second semantic extraction module which is a full convolution network. The second semantic extraction module may perform optimization using a cross entropy loss function, wherein the minimized cross entropy loss function formula is:
$$\min_{\theta'}\;\mathcal{L}(\theta') = -\sum_{i=1}^{C} s_i \log F'_{\theta'}(x)_i \qquad (4)$$

where θ′ is the model parameter of the second semantic extraction module F′, whose output is the segmentation result together with the feature map preceding it; x ∈ R^{N×H×W} is the input remote sensing image, with N, H and W respectively the number of image channels, the height and the width; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H and W respectively its number of channels, height and width; s_i is the ground truth of the i-th class of interest targets, a value of 1 marking a pixel that belongs to that class and 0 one that does not; F′_θ′(x)_i is the confidence with which the second semantic extraction module predicts the i-th class of interest targets.
In addition, the model can use different loss functions, such as the Focal loss or the Lovász loss, according to its specific details and the training data set.
And then, the second map drawing module receives output information (a mask and a feature map) of the second semantic extraction module, the preliminary network map of the k-1 layer corresponding to the remote sensing image and the 4 refined network maps corresponding to the k layer at the same time, and generates the refined network map corresponding to the remote sensing image.
The second map drawing module is a conditional generative adversarial network that performs supervised learning with the ground-truth results of the target domain. It comprises a generator and a discriminator that are trained adversarially: the generator synthesizes data according to the given conditions, and the discriminator distinguishes the generated data from real data. The generator tries to produce data as close to real as possible, and the discriminator accordingly tries to distinguish real data from generated data perfectly. In this process the discriminator takes the role of a loss function learned from the image data and guides the generator. Through this mutual game between generator and discriminator, the generator finally produces generated data that meet the quality requirements. The basic loss function used by this module is:
$$\min_{\phi'}\max_{\psi'}\;\mathbb{E}_{y_{k-1}\sim p_{data}(y)}\left[\log D_{\psi'}(y_{k-1})\right]+\mathbb{E}\left[\log\left(1-D_{\psi'}\left(G_{\phi'}(x_{k-1},F'_{\theta'}(x_{k-1}),y'_{k-1},\hat{y}_k^{\downarrow})\right)\right)\right] \qquad (5)$$

where φ′ and ψ′ are respectively the parameters of the generator G and the discriminator D; x, y, y′ ∈ R^{C×H×W}, x being the remote sensing image, y the real network map and y′ the preliminary network map, with C, H and W respectively the number of image channels, the height and the width; the subscripts k-1 and k denote the zoom level of the map; p_data(x), p_data(y) and p_data(y′) are the data distributions of the remote sensing images, the real network maps and the preliminary network maps; ŷ_k^↓ denotes the level-k refined network map stitched and downsampled into an image of the same size as the remote sensing image; y′_{k-1}, y_{k-1} and x_{k-1} represent the same actual geographic area in position and size; F′_θ′(x_{k-1}) is the mask and feature map output by the second semantic extraction module; E denotes the mathematical expectation.
In addition, the model can choose different loss functions, such as a reconstruction loss, a feature matching loss, a perceptual loss or a multi-scale discriminator loss, according to its specific details and the training data set.
Step 2: and (4) a use stage.
In the use stage, firstly, if the acquired remote sensing image is a single-level image, the remote sensing image needs to be expanded into a multi-level image; then, inputting the remote sensing images of all levels and the corresponding level information into a trained preliminary generation algorithm model in sequence, and generating and storing a corresponding preliminary network map; based on each level of remote sensing image and the preliminary network map, using a trained map improvement algorithm model to sequentially generate each level of fine network map from high to low; and finally, splicing the generated fine network maps of all levels into a multi-level network map according to the numbers.
Specifically, step 2 comprises the steps of:
step 2.1: if the acquired remote sensing image is a single-level image, the acquired remote sensing image needs to be expanded into a multi-level image.
The specific method comprises the following steps:
and (2) regarding the collected single-layer remote sensing image as a kth layer, numbering all tiles, splicing every adjacent 2 x 2 remote sensing image tiles, then sampling by interpolation and other methods until the size of the tiles is the same as that of the original single tile, and processing all remote sensing image tiles on the layer to obtain the kth-1 layer remote sensing image. The above steps are repeated to iteratively generate the k-2, k-3 and other layer images until the number of the layer images is small enough (within 20 sheets for example) or the layer images have only one row or only one column. And after the generation of the remote sensing images of each layer is finished, taking the lowest layer image as the 0 th layer, and numbering each layer again.
Step 2.2: sequentially input the remote sensing images of each level and the corresponding level information into the trained preliminary generation algorithm model, and generate and store the corresponding preliminary network maps.
The specific method comprises the following steps:
and creating a network model according to the preliminarily generated algorithm model structure and parameters stored in the training stage, inputting the remote sensing image and the level information into the model, predicting the model through a first semantic extraction module and a first map drawing module respectively, and automatically storing a preliminarily network map finally generated by the first map drawing module, wherein the network map is in an RGB image format, and the size of the network map is consistent with that of the input remote sensing image tile. The preliminary network map generation formula is as follows:
$$y' = G_{\phi}(x, F_{\theta}(x), k) \qquad (6)$$

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image (expanded into level information after being input into the model), F_θ(x) is the mask and feature map output by the first semantic extraction module, G denotes the generator module and φ the generator module parameters.
Step 2.3: based on each level of remote sensing image and the preliminary network map, using a trained map improvement algorithm model to sequentially generate each level of fine network map from high to low;
for a multi-level remote sensing image data set containing K levels, the data levels are all integers in {0,1, …, K-1} numbered. First, we take the preliminary network map of the K-1 layer as the refined network map of the layer. And then, establishing a network model according to a map improvement algorithm model structure and parameters saved in a training stage, taking K as K-1, respectively inputting the K-1 layer remote sensing image, the K-1 layer preliminary network map and the K layer refined network map into the model for operation, and generating and storing a corresponding K-1 layer refined network map. And (4) sequentially taking all integers with K being {1,2, …, K-2} from high to low, repeating the above process for each K value, and finishing the generation of the network map of all the levels.
The generation formula of the refined network map is as follows:
$$\hat{y}_{k-1} = G_{\phi'}\left(x_{k-1}, F'_{\theta'}(x_{k-1}), y'_{k-1}, \hat{y}_k^{\downarrow}\right) \qquad (7)$$

where φ′ is the parameter of the generator module G; x, y, y′ ∈ R^{C×H×W}, x being the remote sensing image, y the real network map and y′ the preliminary network map, with C, H and W respectively the number of image channels, the height and the width; the subscripts k and k-1 denote the zoom level of the map; ŷ_{k-1} is the level k-1 refined network map produced by the generator; ŷ_k^↓ denotes the level-k refined network map stitched and downsampled into an image of the same size as the remote sensing image; y′_{k-1} and x_{k-1} represent the same actual area in position and size; F′_θ′(x_{k-1}) is the mask and feature map output by the second semantic extraction module.
Step 2.4: after the network map of every level has been generated tile by tile, stitch the generated network maps according to their sequence numbers to obtain the complete multi-level network map. The method can generate a multi-level network map at any scale.

Claims (10)

1. A multi-level network map generation method based on remote sensing images is characterized by comprising the following steps:
step 1: a training stage;
step 1.1: clustering pixel color values of a network map in a training data set of remote sensing image-network map pairing, and solving a ground feature type mask corresponding to the network map;
step 1.2: the method comprises the following steps of training a preliminary generation algorithm model by mixedly using each level remote sensing image, corresponding level information, a corresponding real ground object class mask and a corresponding real network map, and comprises the following steps:
randomly selecting a remote sensing image from a training data set, normalizing the corresponding level number by dividing the corresponding level number by a total level number K, and inputting the remote sensing image and normalized level information into a preliminary generation algorithm model; the model outputs a prediction result of a ground object type mask and a prediction result of a network map;
the size of the prediction result of the ground object type mask is consistent with that of the input remote sensing image, the solution space of each pixel is all integers in [0, (n-1) ], each integer represents one ground object type, and n is the total number of the ground object types; the prediction result of the network map is a network map picture in an RGB format, and the size of the network map picture is consistent with that of the input remote sensing image; comparing the ground feature type mask prediction result and the network map prediction result output by the model with the real ground feature type mask and the real network map respectively, calculating a loss function, reversely propagating a loss value, and updating parameters in the preliminarily generated algorithm model; continuously repeating the process until the set iteration times are met, and storing the structure and the model parameters of the network to obtain the trained preliminary generation algorithm model structure and parameters;
the preliminary generated algorithm model includes two modules: the first semantic extraction module and the first map drawing module;
when the remote sensing image is input into the preliminary generation algorithm model, it firstly passes through a first semantic extraction module which is a full convolution network; then, a first map drawing module receives, at the same time, output information of the first semantic extraction module, the original remote sensing image and the corresponding hierarchy information, and generates a preliminary network map in an RGB format, wherein the hierarchy information is a 1×H×W vector filled with the hierarchy number of the remote sensing image after normalization by dividing it by the total hierarchy number K, and H and W are respectively the height and the width of the input remote sensing image;
the first map drawing module is a conditional generative adversarial network which performs supervised learning by using the result truth value of the target domain, and comprises a generator and a discriminator which are trained adversarially: the generator generates synthetic data according to given conditions, and the discriminator distinguishes the generated data from the real data; in the process, the discriminator serves as a loss function learned from the image data and guides the generator to generate an image;
step 1.3: respectively inputting all remote sensing images and corresponding level numbers in the training set into a trained preliminary generation algorithm model, and generating a preliminary network map of each level for storage and standby;
step 1.4: the method sequentially utilizes the remote sensing images of all levels from high to low, the preliminary network map, the real ground feature class mask and the real network map to train the map improvement algorithm model, and comprises the following steps:
for a multi-level network map training data set containing K levels, the data levels are numbered with all integers in {0, 1, …, K-1}; firstly, the preliminary network map of level K-1 is taken as the refined network map of that level; then, with k = K-1, a level k-1 remote sensing image, the level k-1 preliminary network map corresponding to it and the 4 refined network maps of the corresponding level-k region are randomly selected and input into the map improvement algorithm model, generating a level k-1 ground object class mask prediction result and a level k-1 network map prediction result, which are compared with the real ground object class mask and the real network map respectively; a loss function is calculated and the parameters in the map improvement algorithm model are updated according to it;
the previous step is repeated until the set number of iterations is reached, after which the current map improvement algorithm model is used to generate and store the corresponding level k−1 refined network maps for all level k−1 remote sensing images; k is then taken as each integer in {1, 2, …, K−2} in descending order; after the above training process has been repeated for each value of k, training of the map improvement algorithm model is complete;
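The descending-level schedule above can be summarized as the following sketch; all helper callables are hypothetical placeholders injected as arguments, not the patent's actual code:

    def coarse_to_fine_training(model, preliminary_maps, K, num_iterations,
                                sample_training_batch, train_step, generate_refined_maps):
        # preliminary_maps: dict level -> tiles; the top level's preliminary
        # map serves directly as its refined map.
        refined = {K - 1: preliminary_maps[K - 1]}
        for k in range(K - 1, 0, -1):            # k = K-1, K-2, ..., 1
            for _ in range(num_iterations):
                batch = sample_training_batch(level=k - 1, refined_above=refined[k])
                train_step(model, batch)         # mask + map losses, backprop, update
            # with the current model, refine every level k-1 tile for the next round
            refined[k - 1] = generate_refined_maps(model, level=k - 1,
                                                   refined_above=refined[k])
        return refined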
the map improvement algorithm model comprises a second semantic extraction module and a second map drawing module;
when a level k−1 remote sensing image is input into the map improvement algorithm model, it first passes through the second semantic extraction module, which is a fully convolutional network;
the second map drawing module then simultaneously receives the output of the second semantic extraction module, the level k−1 preliminary network map corresponding to the remote sensing image, and the level k refined network map corresponding to that preliminary network map, and generates the refined network map corresponding to the remote sensing image;
the second map drawing module is a conditional generative adversarial network that performs supervised learning using ground-truth results of the target domain, and comprises a generator and a discriminator that are trained adversarially: the generator synthesizes data according to the given conditions, while the discriminator distinguishes the generated data from real data; in this process, the discriminator acts as a loss function learned from image data, guiding the generator to produce images; through the mutual game between the generator and the discriminator, the generator eventually produces generated data that meets the quality requirement;
step 2: a use stage;
step 2.1: if the acquired remote sensing images form a single level, expand them into multi-level images;
step 2.2: input the remote sensing images of each level and the corresponding level information in turn into the trained preliminary generation algorithm model, and generate and store the corresponding preliminary network maps;
step 2.3: based on the remote sensing images and preliminary network maps of each level, use the trained map improvement algorithm model to generate the refined network map of each level in turn from high to low;
step 2.4: splice the generated refined network maps of each level into a multi-level network map according to their tile numbers.
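As an illustrative sketch of step 2.4, assuming Pillow and tiles keyed by hypothetical (row, col) numbers:

    from PIL import Image

    def stitch_level(tiles: dict, rows: int, cols: int, tile_size: int = 256) -> Image.Image:
        # Paste each numbered refined tile back at its (row, col) position
        # to assemble the full image of one map level.
        canvas = Image.new("RGB", (cols * tile_size, rows * tile_size))
        for (r, c), tile in tiles.items():
            canvas.paste(tile, (c * tile_size, r * tile_size))
        return canvas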
2. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein in the training stage, the pixel color values of the network maps in the remote sensing image–network map paired training data set are clustered to obtain the ground-feature class masks corresponding to the network maps, specifically as follows:
first, a clustering algorithm is used to cluster all pixels of the real network map data in the training set, the class number of each pixel is obtained, and each class number is mapped to the ground-feature semantic class it expresses;
then the semantic class of each pixel is restored according to the pixel's original position in the network map, and the real ground-feature class masks in one-to-one correspondence with the real network maps are generated and stored.
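One plausible realization of this clustering step; scikit-learn's KMeans is an assumption, as the claim does not name a specific clustering algorithm:

    import numpy as np
    from sklearn.cluster import KMeans

    def map_to_class_mask(map_rgb: np.ndarray, n_classes: int) -> np.ndarray:
        # Cluster all pixel colors, then restore each cluster id to the
        # pixel's original position, giving an H x W class mask.
        h, w, _ = map_rgb.shape
        pixels = map_rgb.reshape(-1, 3).astype(np.float32)
        labels = KMeans(n_clusters=n_classes, n_init=10).fit_predict(pixels)
        return labels.reshape(h, w)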
3. The method as claimed in claim 1, wherein the first semantic extraction module is optimized using a cross-entropy loss function, minimized according to the formula:
min_θ −∑_{i=1}^{C} s_i log F_θ(x)_i    (1)

where θ is the model parameter of the first semantic extraction module F, whose output is the segmentation result and the feature map preceding it; x ∈ R^{N×H×W} denotes the input remote sensing image, with N, H, and W the number of image channels, the height, and the width, respectively; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H, and W its number of channels, height, and width; s_i is the segmentation ground truth of the i-th class of interest, taking 1 where the point belongs to that class and 0 otherwise; F_θ(x)_i is the prediction confidence of the semantic extraction module for the i-th class of interest.
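Formula (1) translates directly into a few lines of PyTorch; the batch dimension and one-hot target layout are illustrative assumptions:

    import torch
    import torch.nn.functional as F

    def semantic_ce_loss(logits: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
        # logits: (B, C, H, W) raw scores; s: (B, C, H, W) one-hot ground truth.
        log_conf = F.log_softmax(logits, dim=1)   # log F_theta(x)_i as log-confidences
        return -(s * log_conf).sum(dim=1).mean()  # -sum_i s_i log F_theta(x)_i, averaged over pixels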
4. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein the basis loss function used by the first map drawing module is:
min_φ max_ψ E_{y∼p_data(y)}[log D_ψ(y)] + E_{x∼p_data(x)}[log(1 − D_ψ(G_φ(x, F_θ(x), k)))]    (2)

where φ and ψ are the parameters of the generator G and the discriminator D, respectively; x, y ∈ R^{C×H×W}, x is the remote sensing image and y is the real network map, with C, H, and W the number of image channels, the height, and the width; p_data(x) and p_data(y) denote the data distributions of the remote sensing images and the real network maps; k is the zoom level number of the remote sensing image, which is expanded into level information after being input into the model; F_θ(x) is the mask and feature map output by the first semantic extraction module; E denotes the mathematical expectation.
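For concreteness, the value function in formula (2) could be evaluated as in the following sketch, assuming the discriminator outputs raw logits:

    import torch

    def cgan_value(D, G, x, feats, level_info, y):
        # E[log D(y)] + E[log(1 - D(G(x, F(x), k)))], with a small eps for stability.
        eps = 1e-8
        d_real = torch.sigmoid(D(y))
        d_fake = torch.sigmoid(D(G(x, feats, level_info)))
        return (torch.log(d_real + eps) + torch.log(1.0 - d_fake + eps)).mean()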
5. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein in the training stage, the preliminary network map of each level is generated as follows:
y′ = G_φ(x, F_θ(x), k)    (3)

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image, which is expanded into level information after being input into the model, F_θ(x) is the output of the first semantic extraction module, G denotes the generator, and φ denotes the generator parameters.
6. The method as claimed in claim 1, wherein the second semantic extraction module is optimized using a cross-entropy loss function, minimized according to the formula:
min_{θ′} −∑_{i=1}^{C} s_i log F′_{θ′}(x)_i    (4)

where θ′ is the model parameter of the second semantic extraction module F′, whose output is the segmentation result and the feature map preceding it; x ∈ R^{N×H×W} denotes the input remote sensing image, with N, H, and W the number of image channels, the height, and the width, respectively; s ∈ R^{C×H×W} is the semantic segmentation ground truth, with C, H, and W its number of channels, height, and width; s_i is the segmentation ground truth of the i-th class of interest, taking 1 where the point belongs to that class and 0 otherwise; F′_{θ′}(x)_i is the prediction confidence of the second semantic extraction module for the i-th class of interest.
7. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein the basis loss function used by the second map drawing module is:
min_{φ′} max_{ψ′} E_{y_{k−1}∼p_data(y)}[log D_{ψ′}(y_{k−1})] + E_{x_{k−1}∼p_data(x), y′_{k−1}∼p_data(y′)}[log(1 − D_{ψ′}(G_{φ′}(x_{k−1}, F′_{θ′}(x_{k−1}), y′_{k−1}, ỹ_k)))]    (5)

where φ′ and ψ′ are the parameters of the generator G and the discriminator D, respectively; x, y, y′ ∈ R^{C×H×W}, x is the remote sensing image, y is the real network map, and y′ is the preliminary network map, with C, H, and W the number of image channels, the height, and the width; the subscripts k−1 and k denote the zoom level of the map; p_data(x), p_data(y), and p_data(y′) denote the data distributions of the remote sensing images, the real network maps, and the preliminary network maps; ỹ_k denotes the level k refined network map after splicing and downsampling to an image of the same size as the remote sensing image; ỹ_k, y′_{k−1}, y_{k−1}, and x_{k−1} represent actual geographic areas of the same location and size; F′_{θ′}(x_{k−1}) is the mask and feature map output by the second semantic extraction module; E denotes the mathematical expectation.
8. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein in the use stage, the collected single-level remote sensing images are expanded into multi-level images as follows:
the collected single-level remote sensing images are regarded as level k, and all tiles are numbered; each adjacent 2×2 group of remote sensing image tiles is spliced and downsampled by interpolation to the same size as an original single tile; after all remote sensing image tiles of the level have been processed, the level k−1 remote sensing images are obtained;
this step is repeated to generate each level of images iteratively, until a level has only one row or only one column; after all levels of remote sensing images have been generated, the lowest level is taken as level 0 and the levels are renumbered.
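A minimal sketch of this expansion; Pillow, bilinear interpolation, and (row, col) tile keys are assumptions:

    from PIL import Image

    def build_parent_level(tiles: dict, tile_size: int = 256) -> dict:
        # Splice each adjacent 2x2 group of level-k tiles, then downsample
        # back to a single tile, yielding the level k-1 tiles.
        rows = 1 + max(r for r, _ in tiles)
        cols = 1 + max(c for _, c in tiles)
        parents = {}
        for r in range(0, rows, 2):
            for c in range(0, cols, 2):
                canvas = Image.new("RGB", (2 * tile_size, 2 * tile_size))
                for dr in (0, 1):
                    for dc in (0, 1):
                        if (r + dr, c + dc) in tiles:
                            canvas.paste(tiles[(r + dr, c + dc)], (dc * tile_size, dr * tile_size))
                parents[(r // 2, c // 2)] = canvas.resize((tile_size, tile_size), Image.BILINEAR)
        return parents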
9. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein in the use stage, the corresponding preliminary network map is generated as follows:
y′ = G_φ(x, F_θ(x), k)    (6)

where y′ is the preliminary network map, x is the remote sensing image, k is the zoom level number of the remote sensing image, which is expanded into level information after being input into the model, F_θ(x) represents the mask and feature map output by the first semantic extraction module, G represents the generator module, and φ represents the generator module parameters.
10. The method for generating a multi-level network map based on remote sensing images as claimed in claim 1, wherein in the use stage, based on the multi-level remote sensing images and the preliminary network maps, the refined network map of each level is generated in turn from high to low using the trained map improvement algorithm model, as follows:
for a multi-level remote sensing image data set containing K levels, the levels are numbered with the integers in {0, 1, …, K−1}; first, the preliminary network map of level K−1 is taken as the refined network map of that level; then a network model is built from the map improvement algorithm model structure and parameters saved in the training stage; with k = K−1, the level k−1 remote sensing images, the level k−1 preliminary network maps, and the level k refined network maps are input into the model, and the corresponding level k−1 refined network maps are generated and stored; k is then taken as each integer in {1, 2, …, K−2} in descending order, and the above process is repeated for each value of k, completing the generation of the refined network maps of all levels;
the refined network map is generated according to:

ŷ_{k−1} = G_{φ′}(x_{k−1}, F′_{θ′}(x_{k−1}), y′_{k−1}, ỹ_k)    (7)

where φ′ is the parameter of the generator module G; x, y, y′ ∈ R^{C×H×W}, x is the remote sensing image, y is the real network map, and y′ is the preliminary network map, with C, H, and W the number of image channels, the height, and the width; the subscripts k and k−1 denote the zoom level of the map; ŷ_{k−1} is the level k−1 refined network map produced by the generator; ỹ_k denotes the level k refined network map after splicing and downsampling to an image of the same size as the remote sensing image; ỹ_k, y′_{k−1}, and x_{k−1} represent actual areas of the same location and size; F′_{θ′}(x_{k−1}) is the mask and feature map output by the second semantic extraction module.
CN202110377329.XA 2021-04-08 2021-04-08 Multi-level network map intelligent generation method based on remote sensing image Active CN113052121B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110377329.XA CN113052121B (en) 2021-04-08 2021-04-08 Multi-level network map intelligent generation method based on remote sensing image

Publications (2)

Publication Number Publication Date
CN113052121A CN113052121A (en) 2021-06-29
CN113052121B true CN113052121B (en) 2022-09-06

Family

ID=76519072


Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114418005B (en) * 2022-01-21 2022-09-20 杭州碧游信息技术有限公司 Game map automatic generation method, device, medium and equipment based on GAN network
CN114882139B (en) * 2022-04-12 2024-06-07 北京理工大学 End-to-end intelligent generation method and system for multi-level map

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9721181B2 (en) * 2015-12-07 2017-08-01 The Climate Corporation Cloud detection on remote sensing imagery
US11301722B2 (en) * 2019-05-14 2022-04-12 Here Global B.V. Method, apparatus, and system for providing map embedding analytics

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110516539A (en) * 2019-07-17 2019-11-29 苏州中科天启遥感科技有限公司 Remote sensing image building extracting method, system, storage medium and equipment based on confrontation network
CN111625608A (en) * 2020-04-20 2020-09-04 中国地质大学(武汉) Method and system for generating electronic map according to remote sensing image based on GAN model
CN111626947A (en) * 2020-04-27 2020-09-04 国家电网有限公司 Map vectorization sample enhancement method and system based on generation of countermeasure network
CN112580654A (en) * 2020-12-25 2021-03-30 西南电子技术研究所(中国电子科技集团公司第十研究所) Semantic segmentation method for ground objects of remote sensing image

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
A Multi-Level Feature Fusion Network for Remote Sensing; Sijun Dong et al.; https://www.mdpi.com/journal/sensors; 2021-02-10; pp. 1-17 *
Object Detection in Remote Sensing Images Based on Improved Bounding Box Regression and Multi-Level Features Fusion; Xiaoliang Qian et al.; www.mdpi.com/journal/remotesensing; 2020-01-01; pp. 1-21 *
Remote sensing image segmentation method based on multi-level channel attention; Yu Shuai et al.; Laser & Optoelectronics Progress; 2020-02-29; Vol. 57, No. 4; pp. 1-10 *
Remote sensing target detection and feature extraction based on deep neural networks; Wang Gang et al.; Radio Engineering; 2018-08-23; No. 09; pp. 760-766 *
Super-resolution reconstruction of airborne remote sensing images based on generative adversarial networks; Bi Xiaojun et al.; CAAI Transactions on Intelligent Systems; 2020-01-05; Vol. 15, No. 01; pp. 74-83 *

Similar Documents

Publication Publication Date Title
CN106909902B (en) Remote sensing target detection method based on improved hierarchical significant model
CN108038445B (en) SAR automatic target identification method based on multi-view deep learning framework
CN107092870B (en) A kind of high resolution image Semantic features extraction method
CN111178316B (en) High-resolution remote sensing image land coverage classification method
CN112149547B (en) Remote sensing image water body identification method based on image pyramid guidance and pixel pair matching
CN114155481A (en) Method and device for recognizing unstructured field road scene based on semantic segmentation
CN111985376A (en) Remote sensing image ship contour extraction method based on deep learning
CN112052783A (en) High-resolution image weak supervision building extraction method combining pixel semantic association and boundary attention
CN111241970B (en) SAR image sea surface ship detection method based on yolov3 algorithm and sliding window strategy
CN113052121B (en) Multi-level network map intelligent generation method based on remote sensing image
CN112347970A (en) Remote sensing image ground object identification method based on graph convolution neural network
CN111414954B (en) Rock image retrieval method and system
CN111753677A (en) Multi-angle remote sensing ship image target detection method based on characteristic pyramid structure
CN111640116B (en) Aerial photography graph building segmentation method and device based on deep convolutional residual error network
Alidoost et al. Knowledge based 3D building model recognition using convolutional neural networks from LiDAR and aerial imageries
CN109657082B (en) Remote sensing image multi-label retrieval method and system based on full convolution neural network
CN112950780A (en) Intelligent network map generation method and system based on remote sensing image
CN113449691A (en) Human shape recognition system and method based on non-local attention mechanism
CN113591617B (en) Deep learning-based water surface small target detection and classification method
CN112785636A (en) Multi-scale enhanced monocular depth estimation method
CN114820655A (en) Weak supervision building segmentation method taking reliable area as attention mechanism supervision
CN115482518A (en) Extensible multitask visual perception method for traffic scene
CN114187506B (en) Remote sensing image scene classification method of viewpoint-aware dynamic routing capsule network
CN114863266A (en) Land use classification method based on deep space-time mode interactive network
CN104463962B (en) Three-dimensional scene reconstruction method based on GPS information video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant