
CN112434702A - Image processing method, image processing device, computer equipment and storage medium - Google Patents

Image processing method, image processing device, computer equipment and storage medium

Info

Publication number
CN112434702A
CN112434702A
Authority
CN
China
Prior art keywords
image
data
coding
decoding
feature data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910792402.2A
Other languages
Chinese (zh)
Inventor
陈伟涛
王洪彬
李�昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The embodiment of the application discloses an image processing method and device. The method comprises the following steps: inputting a first image and a second image into corresponding coding networks for feature extraction to obtain feature data of different coding levels; merging the feature data of the same coding level to obtain target feature data of different coding levels; sequentially performing superposition and decoding operations on the target feature data of the different coding levels to obtain decoded data; and determining difference data between the first image and the second image according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving both the accuracy and the efficiency of finding differences between images.

Description

Image processing method, image processing device, computer equipment and storage medium
Technical Field
The present application relates to the field of image processing technologies, and in particular, to an image processing method, an image processing apparatus, a computer device, and a computer-readable storage medium.
Background
With the development of computer technology, automated detection has begun to be applied to the task of comparing two-phase remote sensing images. Detection of newly added buildings is the most common example and is mostly used in land administration law enforcement.
Compared with newly added building detection, earthwork (ground-disturbance) change detection covers a wider range. It includes not only the detection of newly added buildings but also the detection of building demolition, building reconstruction, conversion of agricultural land or forest land into construction sites or disturbed ground, newly added roads, coastline changes and the like. The application scenarios also extend from land administration law enforcement alone to ecological environment management, disaster assessment, offshore area management and other scenarios. Therefore, how to extract the target earthwork changes on their own from the many changes present in a complex pair of two-phase remote sensing images is of great significance.
The applicant has found that, in practice, most change areas still need to be extracted manually, which leads to many missed extractions and low efficiency, while general computer detection methods have not really been applied to actual scenarios because how completely and how accurately they find change areas is not yet ideal.
Disclosure of Invention
In view of the above, the present application is made to provide an image processing method, a computer apparatus, and a computer-readable storage medium that overcome or at least partially solve the above problems.
According to an aspect of the present application, there is provided an image processing method including:
respectively inputting the first image and the second image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
determining difference data between the first image and the second image according to the decoded data.
Optionally, before the first image and the second image are respectively input to corresponding coding networks for feature extraction, so as to obtain feature data of different coding levels, the method further includes:
receiving two remote sensing images of the same area at different times as the first image and the second image.
Optionally, the respectively inputting the first image and the second image into corresponding coding networks for feature extraction, and obtaining feature data of different coding levels includes:
extracting the characteristics of the first image and the second image in a corresponding coding network to obtain characteristic data of a 1 st coding level;
and performing feature extraction on the feature data of the (N-1) th coding level in the coding network to obtain the feature data of the Nth coding level.
Optionally, the sequentially performing a superposition operation and a decoding operation on the target feature data of the different encoding levels to obtain decoded data includes:
decoding the target characteristic data of the Nth coding level to obtain Nth decoding data;
performing superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain N-1 th decoding data;
and iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target characteristic data of each coding level are sequentially completed.
Optionally, the performing a superposition operation and a decoding operation on the target feature data of the (N-1)th coding level and the Nth decoded data to obtain the (N-1)th decoded data includes:
performing superposition operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain a superposition result;
and sequentially performing decoding operations on the superposition result through a first convolution layer, a first deconvolution layer and a second convolution layer to obtain the N-1 decoding data, wherein the number of output channels of the first convolution layer is one fourth of the number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target characteristic data of the N-2 coding level.
Optionally, after the iteratively performing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target feature data of each coding level are sequentially completed, the sequentially performing the superposition operation and the decoding operation on the target feature data of different coding levels to obtain decoded data further includes:
and sequentially carrying out decoding operations on the 1 st decoded data by a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is half of the number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
Optionally, the merging the feature data of the same coding level to obtain the target feature data of different coding levels includes:
and merging channels of the characteristic data to obtain the target characteristic data.
Optionally, the determining, from the decoded data, difference data between the first image and the second image comprises:
determining difference degree data at different positions between the first image and the second image according to the decoding data;
and determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
Optionally, before the first image and the second image are respectively input to corresponding coding networks for feature extraction, so as to obtain feature data of different coding levels, the method further includes:
a neural network for image processing is trained using the first image samples and the second image samples, the neural network including an encoding network and a decoding network.
In accordance with another aspect of the present application, there is provided an image processing method including:
receiving a first remote sensing image and a second remote sensing image which aim at the same area and are different in time;
respectively inputting the first remote sensing image and the second remote sensing image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
and determining a target area with difference between the first remote sensing image and the second remote sensing image according to the decoded data.
In accordance with another aspect of the present application, there is provided an image processing apparatus including a neural network including a first encoding network, a second encoding network, a decoding network, and a difference determination module;
the first encoding network is configured to: extracting the characteristics of the first image to obtain characteristic data of different coding levels;
the second encoding network is configured to: extracting the features of the second image to obtain feature data of different coding levels;
the decoding network is configured to: merging the feature data of the same coding level to obtain target feature data of different coding levels; sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
the discrepancy determining module is to: determining difference data between the first image and the second image according to the decoded data.
According to the embodiment of the application, the first image and the second image are respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and difference data between the first image and the second image is determined according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving both the accuracy and the efficiency of finding differences between images.
The foregoing description is only an overview of the technical solutions of the present application, and the present application can be implemented according to the content of the description in order to make the technical means of the present application more clearly understood, and the following detailed description of the present application is given in order to make the above and other objects, features, and advantages of the present application more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a schematic diagram of an image processing process;
FIG. 2 is a flow chart of an embodiment of an image processing method according to a first embodiment of the present application;
FIG. 3 shows a schematic diagram of an image processing architecture;
FIG. 4 is a flow chart of an embodiment of an image processing method according to the second embodiment of the present application;
FIG. 5 is a flow chart of an embodiment of an image processing method according to the third embodiment of the present application;
FIG. 6 is a block diagram of an embodiment of an image processing apparatus according to a fourth embodiment of the present application;
FIG. 7 is a block diagram of an embodiment of an image processing apparatus according to the fifth embodiment of the present application;
FIG. 8 is a block diagram of an embodiment of an image processing apparatus according to the sixth embodiment of the present application;
fig. 9 illustrates an exemplary system that can be used to implement various embodiments described in this disclosure.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
To enable those skilled in the art to better understand the present application, the following description is made of the concepts related to the present application:
the first image and the second image are two images for which it is necessary to determine whether there is a difference between the two, or the size of the difference, or a region having a difference, or the like. For example, in an application scene of remote sensing dynamic soil change detection, two satellite remote sensing images in two years of a certain area are used as a first image and a second image to realize comparison of buildings, detection of illegal buildings and the like; or in an application scenario of medical treatment or research, two infrared photographs or X-ray photographs of a patient at different periods are taken as the first image and the second image to implement diagnosis of a disease, or any other suitable first image and second image may be included, which is not limited in this application.
The difference between the two images can be characterized using difference data. The difference data includes whether there is a difference, or the degree of the difference, the area with the difference, or any other applicable difference data, which is not limited in the embodiments of the present application. For example, two satellite remote sensing images in two years of a certain area are acquired at different times, and for various reasons, each pixel point between the two remote sensing images may be more or less different, but only when the difference in a region is large or the difference indicates that there is a soil movement change, the region needs to be identified as a region with the difference.
The present application proposes an algorithmic framework for determining difference data between two images, the framework inputting two original images corresponding to two coding networks. The encoding network can automatically encode the original input to generate feature data of a plurality of encoding levels, and the encoding network can encode the input and also can be regarded as extracting features of the input. For example, a back propagation algorithm is used to train the network so that feature data for multiple encoding levels of the encoding network output may represent the input.
The method comprises the steps of performing feature extraction on an input first image and an input second image to obtain more image-specific feature data, and performing feature extraction on the obtained more image-specific feature data to obtain more abstract feature data, namely, the more abstract feature data is obtained by encoding on the basis of the more image-specific feature data. Thus, the feature data may be distinguished by different encoding levels, wherein more concrete feature data corresponds to lower encoding levels and more abstract feature data corresponds to higher encoding levels. The coding network may generate feature data of at least two coding levels, which is not limited in the embodiments of the present application.
For example, the two coding networks are twin networks, i.e., two networks with identical network structures and identical weights. One network in the twin network processes the remote sensing image (namely, the first image) in the previous period and extracts the characteristic data of the image in the previous period, the other network processes the remote sensing image (namely, the second image) in the later period and extracts the characteristic data of the image in the later period, and the weights of the two networks are completely shared. Each network is divided into several modules, the output of the previous module being directly connected to the input of the next module. And performing feature extraction on the first image to obtain feature data of the 1 st coding level of the first image, and performing feature extraction on the feature data of the (N-1) th coding level of the first image to obtain feature data of the Nth coding level of the first image. And performing feature extraction on the second image to obtain feature data of the 1 st coding level of the second image, and performing feature extraction on the feature data of the (N-1) th coding level of the second image to obtain feature data of the Nth coding level of the second image.
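For illustration only (this sketch is not taken from the patent text), a minimal twin encoder with fully shared weights might be written in PyTorch as follows; the four-stage layout, channel widths, kernel sizes and strides are assumed values:

```python
import torch
import torch.nn as nn

class TwinEncoder(nn.Module):
    """Minimal twin (Siamese) encoder: the SAME stage modules, and therefore the
    same weights, process both input images and return feature maps of several
    coding levels for each image."""
    def __init__(self, in_channels=3, widths=(32, 64, 128, 256)):
        super().__init__()
        stages, prev = [], in_channels
        for w in widths:
            stages.append(nn.Sequential(
                nn.Conv2d(prev, w, kernel_size=3, stride=2, padding=1),
                nn.ReLU(inplace=True),
            ))
            prev = w
        self.stages = nn.ModuleList(stages)  # output of each stage feeds the next

    def encode(self, x):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)                  # keep the features of every coding level
        return feats

    def forward(self, img1, img2):
        # Weight sharing is automatic: the same modules are applied to both images.
        return self.encode(img1), self.encode(img2)
```

Because both branches call the same modules, any weight update during training affects both branches identically, which is what complete weight sharing means here.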
And each coding level corresponds to the characteristic data of the first image and the characteristic data of the second image, the characteristic data of the two images in the same coding level are merged, and the obtained characteristic data is marked as target characteristic data. The merging of the feature data may be merging in channel dimension. The feature data corresponding to different coding levels can generate target feature data corresponding to the coding levels.
In order to fuse the feature data of a plurality of coding levels so that the feature data of every coding level contributes to determining the difference data, it is necessary to sequentially perform a superposition operation and a decoding operation on the target feature data of the different coding levels. The decoding operation corresponds to the encoding process and restores the abstract feature data toward the original data as far as possible, for example decoding a feature map through convolution, deconvolution and further convolution, or any other suitable decoding operation, which is not limited in this embodiment of the present application. The superposition operation is a way of fusing features, for example adding two feature maps of the same size, or any other suitable superposition operation, which is not limited in this embodiment of the present application. The data obtained by sequentially performing the superposition operation and the decoding operation on the target feature data of the different coding levels is denoted as decoded data.
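The order of operations described above (merge per level, decode the top level first, then alternate superposition and decoding down the levels) can be summarized in a short sketch. This is only an illustration of the ordering: decode_blocks is a hypothetical list of per-level decoding modules, and it is assumed that each module upsamples and adjusts channels so that the element-wise addition is shape-compatible.

```python
import torch

def fuse_and_decode(feats1, feats2, decode_blocks):
    """feats1 / feats2: per-level feature maps of the two images (level 1 .. N).
    decode_blocks: one decoding module per level, ordered from level N down to 1."""
    # Merge the two images' features of the same coding level along the channel dimension.
    merged = [torch.cat([f1, f2], dim=1) for f1, f2 in zip(feats1, feats2)]

    decoded = decode_blocks[0](merged[-1])           # level N: decoding only
    for level in range(len(merged) - 2, -1, -1):     # levels N-1 down to 1
        fused = merged[level] + decoded              # superposition (element-wise add)
        decoded = decode_blocks[len(merged) - 1 - level](fused)
    return decoded                                   # the decoded data
```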
In an alternative embodiment of the present application, the first image and the second image may be remote sensing images, which include, but are not limited to, films or photographs recording electromagnetic waves of various surface features, and are mainly classified into aerial photographs and satellite photographs, for example, two remote sensing images obtained by satellite remote sensing technology at different times for the same region on the ground may be used as the first image and the second image.
In an optional embodiment of the present application, feature extraction is performed on the first image and the second image in a corresponding coding network, and the obtained feature data is recorded as feature data of the 1 st coding level. And performing feature extraction on the feature data of the (N-1) th coding level of the first image and the second image, and recording the obtained feature data as the feature data of the (N) th coding level.
In an optional embodiment of the present application, a decoding operation is performed on target feature data of an nth coding level, and the obtained data is denoted as nth decoded data. And performing superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the N-th decoded data to obtain data, marking the obtained data as the (N-1) th decoded data, and performing superposition operation and decoding operation on the target characteristic data of the (N-2) th coding level and the N-1 th decoded data to obtain data, marking the obtained data as the (N-2) th decoded data.
In an alternative embodiment of the present application, the decoding operation passes through convolutional layers, deconvolution layers and the like. A convolutional layer is a structure in a convolutional neural network; each convolutional layer consists of several convolutional units, and the parameters of each convolutional unit are obtained through optimization by a back-propagation algorithm. The purpose of the convolution operation is to extract different features of the input: the first convolutional layer may only extract low-level features such as edges, lines and corners, while deeper networks can iteratively extract more complex features from these low-level features. The deconvolution layer, also called a transposed convolutional layer, corresponds to the convolutional layer and can be used to visualize a trained convolutional neural network.
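To make the distinction concrete, the toy shapes below (assumed values, not figures from the patent) show that a strided convolution shrinks the spatial size while a transposed convolution with matching parameters restores it:

```python
import torch
import torch.nn as nn

x = torch.randn(1, 64, 32, 32)                        # (batch, channels, height, width)

conv = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
y = conv(x)
print(y.shape)                                        # torch.Size([1, 128, 16, 16])

deconv = nn.ConvTranspose2d(128, 64, kernel_size=3, stride=2,
                            padding=1, output_padding=1)
z = deconv(y)
print(z.shape)                                        # torch.Size([1, 64, 32, 32])
```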
In an alternative embodiment of the present application, in order to determine difference data between the first image and the second image, it is necessary to obtain difference degree data at each corresponding position between the first image and the second image according to the decoded data. The difference degree data is used for representing the difference degree between the two images at each position, and the magnitude of the numerical value is related to the magnitude of the difference. Each difference degree data corresponds to each position on the image, and information characterizing the position may be denoted as position information, e.g. representing each position on the image in a coordinate manner, which may be used to determine the area with the difference between the first image and the second image.
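A minimal sketch of turning a difference-degree map into position information, assuming the map has already been normalized to [0, 1]; the threshold of 0.5 is an illustrative value, not one stated in the patent:

```python
import numpy as np

def positions_above_threshold(diff_map, threshold=0.5):
    """diff_map: 2-D array of per-pixel difference degree in [0, 1].
    Returns the (row, col) coordinates whose difference degree meets the threshold."""
    return np.argwhere(diff_map >= threshold)

diff_map = np.array([[0.1, 0.9],
                     [0.2, 0.7]])
print(positions_above_threshold(diff_map))            # [[0 1] [1 1]]
```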
In an alternative embodiment of the present application, a neural network for image processing of the present application may be trained using a first image sample and a second image sample. The neural network includes an encoding network and a decoding network. The decoding network is configured to merge feature data of the same coding level, sequentially perform a superposition operation and a decoding operation on target feature data of different coding levels, and the like, and may be specifically used for any other applicable operation, which is not limited in this embodiment of the present application.
In an optional embodiment of the present application, in order to support comparison between images in different scenarios, a neural network may be formed from a plurality of pairs of coding networks. For example, a first image and a second image are respectively input into a corresponding first pair of coding networks, and a third image and a fourth image are respectively input into a corresponding second pair of coding networks; first difference data between the first image and the second image and second difference data between the third image and the fourth image are obtained respectively, and the first difference data and the second difference data are then combined into a comparison report, achieving a more refined or accurate comparison result.
According to an embodiment of the application, in earthwork change detection most change areas currently need to be extracted manually, which leads to many missed extractions and low efficiency. Fig. 1 shows a schematic diagram of an image processing process: the first image and the second image are respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and difference data between the first image and the second image is determined according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving the accuracy and efficiency of finding differences between the images. The present application is applicable to, but not limited to, the above application scenarios.
Referring to fig. 2, a flowchart of an embodiment of an image processing method according to a first embodiment of the present application is shown, where the method may specifically include the following steps:
step 101, inputting the first image and the second image into corresponding coding networks respectively to perform feature extraction, so as to obtain feature data of different coding levels.
In an embodiment of the application, in order to determine difference data between two images, the two inputs correspond to two coding networks: the first image is input to one coding network and the second image is input to the other. Each coding network performs feature extraction to obtain feature data.
For example, as shown in the schematic diagram of the image processing architecture in fig. 3, the earlier remote sensing image (i.e. the first image) and the later remote sensing image (i.e. the second image) are input into the corresponding twin-network-based coding networks, and feature extraction is performed on the original images after certain pre-processing. Each coding network is divided into 4 coding modules, with each coding module directly feeding the next, and each coding network extracts feature maps (i.e. feature data) of 4 coding levels of its image; the encoding part therefore finally outputs 8 feature maps of different levels into the decoding network, where the 4th coding modules of the two networks output the feature map n1 and the feature map n2 respectively.
And 102, merging the feature data of the same coding level to obtain target feature data of different coding levels.
In the embodiment of the present application, feature data of each coding level is merged first, that is, feature data of a first image and feature data of a second image of the same coding level are merged to obtain target feature data of the coding level.
For example, as shown in fig. 3, starting from the 4th coding module, the feature map n1 and the feature map n2 that it outputs are first merged in the channel dimension, and the merged result is n12 (i.e. the target feature data of the 4th coding level); the feature map nl1 and the feature map nl2 output by the 3rd coding module are likewise merged in the channel dimension, and the merged result is nl12 (i.e. the target feature data of the 3rd coding level), and so on.
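The channel-dimension merge corresponds to a single tensor concatenation; the channel count of 256 per branch below is an assumed figure for the example:

```python
import torch

n1 = torch.randn(1, 256, 16, 16)    # level-4 feature map of the first image
n2 = torch.randn(1, 256, 16, 16)    # level-4 feature map of the second image

n12 = torch.cat([n1, n2], dim=1)    # merge along the channel dimension
print(n12.shape)                    # torch.Size([1, 512, 16, 16])
```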
And 103, sequentially performing superposition operation and decoding operation on the target characteristic data of different coding layers to obtain decoded data.
In the embodiment of the present application, in order to fuse target feature data of different coding levels, rather than performing decoding operation only on the target feature data, it is necessary to perform decoding operation after performing superposition operation on the target feature data of one coding level and target feature data of a subsequent coding level. Then for the target feature data of the last encoding level, since there is no target feature data of the next encoding level, the decoding operation is performed first. And sequentially performing superposition operation and decoding operation on the target characteristic data of each coding layer to obtain decoded data.
For example, as shown in fig. 3, the merged result n12 (i.e. the target feature data of the 4th coding level) is decoded by a basic decoding module, which outputs dn12. The merged result nl12 of the feature map nl1 and the feature map nl2 output by the 3rd coding module (i.e. the target feature data of the 3rd coding level) is then added to dn12, and the sum is input into the next basic decoding module for decoding; the result is added to the target feature data of the 2nd coding level and input into the next basic decoding module for decoding, then added to the target feature data of the 1st coding level and input into the next basic decoding module for decoding, finally yielding the decoded data.
Step 104, determining difference data between the first image and the second image according to the decoded data.
In the embodiment of the present application, the implementation manner of determining the difference data between the first image and the second image according to the decoded data may include multiple manners, for example, determining difference degree data at different positions between the first image and the second image according to the decoded data, determining the difference data between the first image and the second image according to position information corresponding to the difference degree data meeting a preset requirement, performing visualization according to the difference data, and displaying an area having a difference between the first image and the second image. Any suitable implementation may be specifically included, and the embodiments of the present application do not limit this.
For example, as shown in fig. 3, the decoded data is passed through a sigmoid (logistic) activation function to obtain difference degree data at each position between the first image and the second image; the difference data is then obtained according to whether the difference degree data exceeds a preset threshold, and visualization is performed according to the difference data, with the rightmost image in the figure showing the area having a difference between the first image and the second image.
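One possible way to visualize the thresholded result is sketched below; the overlay colormap and transparency are arbitrary choices and are not prescribed by the patent:

```python
import numpy as np
import matplotlib.pyplot as plt

def show_difference(image2, diff_map, threshold=0.5):
    """Overlay the binary difference mask on the later-phase image."""
    mask = (diff_map >= threshold).astype(float)
    plt.imshow(image2)
    plt.imshow(np.ma.masked_where(mask == 0, mask), cmap="autumn", alpha=0.5)
    plt.axis("off")
    plt.show()
```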
According to the embodiment of the application, the first image and the second image are respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and difference data between the first image and the second image is determined according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving both the accuracy and the efficiency of finding differences between images.
Referring to fig. 4, a flowchart of an embodiment of an image processing method according to the second embodiment of the present application is shown, where the method specifically includes the following steps:
step 201, training a neural network for image processing by using a first image sample and a second image sample, wherein the neural network comprises an encoding network and a decoding network.
In the embodiment of the present application, the neural network for image processing includes an encoding network and a decoding network, and the neural network first needs to be trained using the first image sample and the second image sample. The training process may include repeatedly providing pairs of image samples to the neural network, passing the first image of each pair through the network, passing the second image of the pair through the network, calculating a loss value from the results of the two passes, back-propagating the loss to compute gradients, and updating the weights of the neural network until the desired performance is achieved, finally obtaining a neural network comprising an encoding network and a decoding network.
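A skeleton of the training procedure just described, assuming a model that takes both images and returns raw (pre-sigmoid) difference logits; the binary cross-entropy loss and the Adam optimizer are common choices for this kind of change-detection setup and are assumptions here, not details given in the patent:

```python
import torch
import torch.nn as nn

def train(model, loader, epochs=10, lr=1e-4, device="cpu"):
    """loader yields (img1, img2, change_mask) triples; change_mask is a float
    tensor of per-pixel 0/1 labels with the same shape as the model output."""
    model.to(device)
    criterion = nn.BCEWithLogitsLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for img1, img2, mask in loader:
            img1, img2, mask = img1.to(device), img2.to(device), mask.to(device)
            logits = model(img1, img2)       # both images pass through the network
            loss = criterion(logits, mask)   # compare prediction with the label
            optimizer.zero_grad()
            loss.backward()                  # back-propagate to compute gradients
            optimizer.step()                 # update the (shared) weights
```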
For example, the neural network proposed in the present application may be trained on a certain region of images from two years provided by a certain organization; after training, a region that does not overlap the training region at all is used as the test region for comparing effects. The first method does not use a twin network: it directly stacks the earlier and later images into a 6-channel input, and the number of input channels of the first convolutional layer is accordingly changed from 3 to 6. The second method uses a twin network but only uses the feature data of the highest level. The method proposed herein uses a twin network and uses the feature data of every coding level. The evaluation index is IOU (intersection over union), and the results are shown in the following table:
method of producing a composite material First method Second method The method is presented herein
IOU 0.6 0.64 0.69
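The IOU figure used in the table can be computed from binary change masks as in the following sketch:

```python
import numpy as np

def iou(pred_mask, true_mask):
    """Both arguments are boolean arrays marking changed pixels."""
    intersection = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return intersection / union if union > 0 else 1.0
```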
Step 202, receiving two remote sensing images of the same area at different times as the first image and the second image.
In the embodiment of the application, in order to obtain the earthwork changes of a given region on the ground, two remote sensing images of the same region need to be acquired at different times; the two remote sensing images are received and input into the neural network as the first image and the second image. For example, remote sensing images of the same area from two different years are received, the time interval between the two images being two years.
Step 203, extracting the features of the first image and the second image in the corresponding coding network to obtain the feature data of the 1 st coding level.
In the embodiment of the application, after the first image and the second image are input into the corresponding coding networks, the coding networks perform feature extraction, and first obtain feature data of the 1 st coding level.
And 204, performing feature extraction on the feature data of the (N-1) th coding level in the coding network to obtain the feature data of the Nth coding level.
In the embodiment of the application, the coding network performs feature extraction on the feature data of the 1 st coding level to obtain the feature data of the 2 nd coding level, then performs feature extraction on the feature data of the 2 nd coding level to obtain the feature data of the 3 rd coding level, and performs feature extraction on the feature data of the (N-1) th coding level in the coding network according to rules in sequence to obtain the feature data of the Nth coding level.
For example, as shown in the schematic diagram of the image processing architecture shown in fig. 3, a previous remote sensing image (i.e., a first image) and a subsequent remote sensing image (i.e., a second image) are input into corresponding twin network-based encoding networks, and feature extraction is performed on an original image after certain processing. Each coding network can extract feature maps (i.e., feature data) for 4 coding levels of an image.
And step 205, merging the channels of the feature data to obtain the target feature data.
In the embodiment of the application, merging channels is performed on the feature data of each coding level to obtain target feature data of each coding level.
For example, as shown in fig. 3, starting from the feature map n1 and the feature map n2 of the 4th coding level, the two are first merged in the channel dimension and the result is n12 (i.e. the target feature data of the 4th coding level); the feature map nl1 and the feature map nl2 of the 3rd coding level are then merged in the channel dimension, and the merged result is nl12 (i.e. the target feature data of the 3rd coding level), and so on.
And step 206, performing decoding operation on the target characteristic data of the Nth coding level to obtain Nth decoding data.
In the embodiment of the present application, starting from the target feature data of the Nth coding level, that target feature data is decoded first, and the decoded data is denoted as the Nth decoded data.
For example, as shown in fig. 3, the merged result n12 of the 4th coding level is first input into a basic decoding module and decoded to obtain the Nth decoded data.
And step 207, performing superposition operation and decoding operation on the target feature data of the (N-1) th coding level and the Nth decoding data to obtain the (N-1) th decoding data.
In the embodiment of the application, the target characteristic data of the (N-1) th coding level and the N-th decoding data are subjected to superposition operation, then the superposition result is subjected to decoding operation to obtain the (N-1) th decoding data, and the like until the 1 st decoding data is obtained.
In this embodiment of the present application, optionally, an implementation of performing a superposition operation and a decoding operation on the target feature data of the (N-1)th coding level and the Nth decoded data to obtain the (N-1)th decoded data may include: performing a superposition operation on the target feature data of the (N-1)th coding level and the Nth decoded data to obtain a superposition result; and sequentially performing decoding operations on the superposition result through the first convolutional layer, the first deconvolution layer and the second convolutional layer to obtain the (N-1)th decoded data.
The number of output channels of the first convolution layer is one fourth of the number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target feature data of the (N-2) th coding level.
For example, as shown in fig. 3, the merged result n12 of the 4th coding level is input into a basic decoding module. The first layer of the basic decoding module is a convolutional layer with a 1x1 kernel (i.e. the first convolutional layer), whose number of input channels is the sum of the channel counts of the feature map n1 and the feature map n2 and whose number of output channels is one quarter of the number of input channels. The second layer is a deconvolution layer with a 3x3 kernel (i.e. the first deconvolution layer), whose numbers of input and output channels both equal the number of output channels of the first layer. The third layer is a convolutional layer with a 1x1 kernel (i.e. the second convolutional layer), whose number of input channels equals the number of output channels of the second layer and whose number of output channels is the sum of the channel counts of the two feature maps of the 3rd coding level. Each layer is followed by a ReLU (rectified linear unit) activation function. The basic decoding module outputs the 4th decoded data dn12, which is then added to the target feature data of the 3rd coding level, i.e. the merged result nl12; the sum (i.e. the superposition result) is input into the next basic decoding module, which outputs the 3rd decoded data, and so on until the 1st decoded data is obtained.
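The channel arithmetic of the basic decoding module described above (a 1x1 convolution reducing channels to a quarter, a 3x3 transposed convolution keeping the channel count, a 1x1 convolution matching the next coding level, ReLU after each layer) can be sketched as follows; the class name and the stride of the transposed convolution are assumptions for illustration:

```python
import torch.nn as nn

class BasicDecodeBlock(nn.Module):
    """Sketch of the basic decoding module: 1x1 conv (channels / 4), 3x3 transposed
    conv (channels unchanged, spatial upsampling), 1x1 conv (channels of the next
    coding level), each followed by ReLU."""
    def __init__(self, in_channels, out_channels):
        super().__init__()
        mid = in_channels // 4
        self.block = nn.Sequential(
            nn.Conv2d(in_channels, mid, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(mid, mid, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid, out_channels, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)
```

For instance, if the level-4 merged result has 512 channels and the level-3 merged result has 256 channels (assumed numbers), BasicDecodeBlock(512, 256) produces an output that can be added to the level-3 target feature data.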
And step 208, iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target feature data of each coding level are sequentially completed.
In the embodiment of the present application, the superposition operation and the decoding operation in step 207 are iteratively executed until the superposition operation and the decoding operation of the target feature data of each coding level are sequentially completed, and finally the obtained decoded data fuses the features of each coding level.
Step 209, sequentially subjecting the 1 st decoded data to decoding operations of a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is half of the number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
In this embodiment, the 1 st decoded data may further continue to be decoded, and the 1 st decoded data sequentially passes through the second deconvolution layer, the third convolution layer, and the fourth convolution layer to be decoded, so as to obtain decoded data.
For example, as shown in fig. 3, the 1st decoded data finally obtained has 128 channels. It is sent to a deconvolution layer with 64 output channels and a 4x4 kernel; the output of the deconvolution layer is input to a convolutional layer with 64 input and output channels, padding 1 and a 3x3 kernel; the output of that convolutional layer is input to a convolutional layer with 64 input channels, 1 output channel, padding 1 and a 3x3 kernel, which outputs the decoded data.
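The final decoding stage described here might look like the following sketch; the stride of the transposed convolution and the absence of activation functions are assumptions, since the text does not state them:

```python
import torch.nn as nn

final_decode = nn.Sequential(
    nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),  # 128 -> 64 channels
    nn.Conv2d(64, 64, kernel_size=3, padding=1),                      # 64 -> 64 channels
    nn.Conv2d(64, 1, kernel_size=3, padding=1),                       # 64 -> 1 channel
)
```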
Step 210, determining difference degree data at different positions between the first image and the second image according to the decoded data.
In the embodiment of the application, the decoded data is normalized to be between 0 and 1 through the sigmoid activation function, and difference degree data of each position between the first image and the second image can be obtained.
For example, the decoded data obtained by sequentially performing decoding operations through the second deconvolution layer, the third convolution layer and the fourth convolution layer is normalized to between 0 and 1 by a sigmoid activation function; the normalized values represent the difference degree data at each position between the two images. The difference data between the first image and the second image is then determined according to the position information corresponding to the difference degree data meeting the preset requirement, and visualization is performed to obtain the rightmost image in fig. 3, which shows the region with a difference between the first image and the second image. Any suitable implementation may be used, and the embodiments of the present application do not limit this.
Step 211, determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
In this embodiment of the application, it is determined whether the difference degree data meets a preset requirement, for example, whether the difference degree data is greater than a preset threshold, and if the difference degree data is greater than the preset threshold, the preset requirement is met, which may specifically include any applicable preset requirement, and this embodiment of the application does not limit this. According to the position information corresponding to the difference degree data meeting the preset requirement, difference data between the first image and the second image can be determined, and the difference data can represent the area with the difference between the first image and the second image and the size of the difference.
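Grouping the positions that meet the preset requirement into difference regions can be done, for example, with a connected-component pass, as in the sketch below; scipy is used here only for illustration, and the patent does not prescribe any particular grouping method:

```python
import numpy as np
from scipy import ndimage

def difference_regions(diff_map, threshold=0.5):
    """Return one bounding box (row_min, col_min, row_max, col_max) per changed region."""
    mask = diff_map >= threshold
    labels, num = ndimage.label(mask)            # group neighboring changed pixels
    boxes = []
    for region in range(1, num + 1):
        rows, cols = np.where(labels == region)
        boxes.append((rows.min(), cols.min(), rows.max(), cols.max()))
    return boxes
```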
According to the embodiment of the application, the first image and the second image are respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and difference data between the first image and the second image is determined according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving both the accuracy and the efficiency of finding differences between images.
Referring to fig. 5, a flowchart of an embodiment of an image processing method according to a third embodiment of the present application is shown, where the method specifically includes the following steps:
step 301, receiving a first remote sensing image and a second remote sensing image of different time aiming at the same area.
In the embodiment of the application, in order to obtain the earthwork changes of a given region on the ground, two remote sensing images of the same region need to be acquired at different times; the two remote sensing images are received and denoted as the first remote sensing image and the second remote sensing image. For example, remote sensing images of the same area from two different years are received, the time interval between the two images being two years.
And 302, respectively inputting the first remote sensing image and the second remote sensing image into corresponding coding networks for feature extraction, and obtaining feature data of different coding levels.
In the embodiment of the present application, a specific implementation manner of this step may refer to the description in the foregoing embodiment, and is not described herein again.
And 303, merging the feature data of the same coding level to obtain target feature data of different coding levels.
In the embodiment of the present application, a specific implementation manner of this step may refer to the description in the foregoing embodiment, and is not described herein again.
And step 304, sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data.
In the embodiment of the present application, a specific implementation manner of this step may refer to the description in the foregoing embodiment, and is not described herein again.
And 305, determining a target area with difference between the first remote sensing image and the second remote sensing image according to the decoded data.
In the embodiment of the present application, a specific implementation manner of this step may refer to the description in the foregoing embodiment, and is not described herein again.
According to the embodiment of the application, a first remote sensing image and a second remote sensing image of the same area at different times are received and respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and a target area with a difference between the first remote sensing image and the second remote sensing image is determined according to the decoded data. In this way, the two input remote sensing images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two remote sensing images, improving both the accuracy and the efficiency of finding areas of difference between remote sensing images.
Referring to fig. 6, a block diagram illustrating a structure of an embodiment of an image processing apparatus according to a fourth embodiment of the present application may specifically include:
the image processing apparatus comprises a neural network 400 comprising a first encoding network 4001, a second encoding network 4002, a decoding network 4003 and a disparity determining module 4004;
the first encoding network is configured to: extracting the characteristics of the first image to obtain characteristic data of different coding levels;
the second encoding network is configured to: extracting the features of the second image to obtain feature data of different coding levels;
the decoding network is configured to: merging the feature data of the same coding level to obtain target feature data of different coding levels; sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
the discrepancy determining module is to: determining difference data between the first image and the second image according to the decoded data.
According to the embodiment of the application, the first image and the second image are respectively input into corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to superposition and decoding operations to obtain decoded data; and difference data between the first image and the second image is determined according to the decoded data. In this way, the two input images can be better distinguished during encoding, and the features of multiple coding levels are fused during decoding, so that not only the more abstract high-level features but also the more concrete low-level features and the features of every intermediate level are used. Differences can therefore be extracted efficiently from the two images, improving both the accuracy and the efficiency of finding differences between images.
Referring to fig. 7, a block diagram illustrating a structure of an embodiment of an image processing apparatus according to the fifth embodiment of the present application may specifically include:
an extraction module 501, configured to input the first image and the second image into corresponding coding networks respectively to perform feature extraction, so as to obtain feature data of different coding levels;
a merging module 502, configured to merge feature data of the same coding level to obtain target feature data of different coding levels;
a decoding module 503, configured to perform superposition operation and decoding operation on the target feature data of different coding layers in sequence to obtain decoded data;
a difference determining module 504 configured to determine difference data between the first image and the second image according to the decoded data.
In this embodiment of the present application, optionally, the apparatus further includes:
and the image receiving module is used for receiving two remote sensing images aiming at the same area at different time as the first image and the second image before the first image and the second image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels.
In this embodiment of the application, optionally, the extraction module includes:
the first extraction submodule is used for performing feature extraction on the first image and the second image in the corresponding coding networks to obtain the feature data of the 1st coding level;
and the second extraction submodule is used for performing feature extraction on the feature data of the (N-1)th coding level in the coding network to obtain the feature data of the Nth coding level.
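As one hedged illustration of the two extraction submodules, a coding network can be written as a stack of downsampling stages in which the 1st-level features are extracted from the input image and the Nth-level features are extracted from the (N-1)th-level features. The backbone below (plain strided convolutions, the channel widths, batch normalization and ReLU) is an assumption, since the application does not fix a particular encoder.

import torch.nn as nn

class EncodingNetwork(nn.Module):
    # Level 1 is extracted from the image; level N is extracted from level N-1.
    def __init__(self, in_ch: int = 3, widths=(64, 128, 256, 512)):
        super().__init__()
        stages, prev = [], in_ch
        for w in widths:
            stages.append(nn.Sequential(
                nn.Conv2d(prev, w, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm2d(w),
                nn.ReLU(inplace=True)))
            prev = w
        self.stages = nn.ModuleList(stages)

    def forward(self, image):
        feats, x = [], image
        for stage in self.stages:
            x = stage(x)          # feature data of the next coding level
            feats.append(x)
        return feats              # [level 1, level 2, ..., level N]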
In this embodiment of the present application, optionally, the decoding module includes:
the first decoding submodule is used for carrying out decoding operation on the target characteristic data of the Nth coding level to obtain Nth decoding data;
the second decoding submodule is used for carrying out superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain the (N-1) th decoding data;
and the iteration submodule is used for iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target characteristic data of each coding level are sequentially completed.
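A hedged sketch of the three submodules above: the target feature data of the Nth coding level are decoded first, and the superposition and decoding operations are then iterated down to level 1. Representing the superposition as channel concatenation and keeping one decoding block per level are assumptions made for this example.

import torch

def decode_levels(target_feats, decode_blocks, final_head):
    # target_feats[k] holds the (merged) target feature data of coding level k+1;
    # decode_blocks[k] is the decoding block applied at that level.
    n = len(target_feats)
    decoded = decode_blocks[n - 1](target_feats[n - 1])                 # Nth decoded data
    for level in range(n - 2, -1, -1):                                  # levels N-1, ..., 1
        superposed = torch.cat([target_feats[level], decoded], dim=1)   # superposition operation
        decoded = decode_blocks[level](superposed)                      # (level+1)-th decoded data
    return final_head(decoded)                                          # final decoded data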
In this embodiment of the application, optionally, the second decoding sub-module includes:
the superposition unit is used for carrying out superposition operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain a superposition result;
and the decoding unit is used for sequentially performing decoding operations on the superposition result through a first convolution layer, a first deconvolution layer and a second convolution layer to obtain the (N-1)th decoded data, wherein the number of output channels of the first convolution layer is one fourth of its number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target feature data of the (N-2)th coding level.
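The decoding unit can be sketched as follows; only the channel relationships (first convolution producing one fourth of its input channels, deconvolution preserving the channel count, second convolution matching the channel count of the (N-2)th-level target feature data) follow the description above, while the kernel sizes, the stride-2 upsampling and the ReLU activations are assumptions.

import torch.nn as nn

class DecodeBlock(nn.Module):
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        mid = in_ch // 4                                                       # 1/4 of the input channels
        self.conv1 = nn.Conv2d(in_ch, mid, kernel_size=1)                      # first convolution layer
        self.deconv1 = nn.ConvTranspose2d(mid, mid, kernel_size=2, stride=2)   # same in/out channels
        self.conv2 = nn.Conv2d(mid, out_ch, kernel_size=3, padding=1)          # out_ch = channels of level N-2
        self.act = nn.ReLU(inplace=True)

    def forward(self, superposed):
        x = self.act(self.conv1(superposed))
        x = self.act(self.deconv1(x))
        return self.act(self.conv2(x))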
In this embodiment of the application, optionally, the decoding module further includes:
and the decoding submodule is used for, after the superposition operation and the decoding operation on the target feature data of each coding level are sequentially completed, sequentially performing decoding operations on the 1st decoded data through a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is one half of its number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
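Correspondingly, the final decoding submodule applied to the 1st decoded data might look as follows; again, only the channel relationships (deconvolution halving the channels, third convolution preserving them, fourth convolution producing a single output channel) come from the text, and the kernel sizes and activations are assumptions.

import torch.nn as nn

class FinalDecodeHead(nn.Module):
    def __init__(self, in_ch: int):
        super().__init__()
        half = in_ch // 2
        self.deconv2 = nn.ConvTranspose2d(in_ch, half, kernel_size=2, stride=2)  # out = in / 2
        self.conv3 = nn.Conv2d(half, half, kernel_size=3, padding=1)             # in == out
        self.conv4 = nn.Conv2d(half, 1, kernel_size=1)                           # 1 output channel
        self.act = nn.ReLU(inplace=True)

    def forward(self, decoded_1):
        x = self.act(self.deconv2(decoded_1))
        x = self.act(self.conv3(x))
        return self.conv4(x)      # decoded data: one score per position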
In this embodiment of the present application, optionally, the merging module includes:
and the merging submodule is used for merging the channels of the feature data of the same coding level to obtain the target feature data.
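The merging submodule itself is small; assuming the feature data of the two coding networks are PyTorch tensors, merging the channels of the same coding level can be sketched as:

import torch

def merge_levels(feats_a, feats_b):
    # Concatenate same-level feature data from the two coding networks along
    # the channel dimension to obtain the target feature data of each level.
    return [torch.cat([fa, fb], dim=1) for fa, fb in zip(feats_a, feats_b)]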
In this embodiment of the application, optionally, the difference determining module includes:
the degree determining submodule is used for determining difference degree data at different positions between the first image and the second image according to the decoded data;
and the difference determining submodule is used for determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
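A hedged sketch of these two submodules, assuming the decoded data is a single-channel score map and the preset requirement is a fixed threshold on a sigmoid-normalised score (both assumptions; the application leaves the concrete requirement open):

import torch

def determine_difference(decoded, threshold: float = 0.5):
    degree = torch.sigmoid(decoded)              # difference degree data at each position
    mask = degree > threshold                    # positions meeting the preset requirement
    positions = torch.nonzero(mask.squeeze(1))   # (batch, row, col) indices of differing positions
    return degree, mask, positions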
In this embodiment of the present application, optionally, the apparatus further includes:
and the training module is used for training a neural network for image processing using a first image sample and a second image sample, before the first image and the second image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels; the neural network comprises a coding network and a decoding network.
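As a hedged illustration of the training module, one training step could be supervised with a binary change mask; the optimizer, the binary cross-entropy loss and the label shape are assumptions made for this example, since the application does not prescribe a specific training objective.

import torch
import torch.nn.functional as F

def train_step(model, optimizer, img_a, img_b, change_mask):
    # model follows the ImageProcessingApparatus sketch above; change_mask is a
    # float tensor of shape (batch, 1, H, W) with 1 where the two images differ.
    model.train()
    optimizer.zero_grad()
    decoded = model.decoder(model.encoder_a(img_a), model.encoder_b(img_b))
    loss = F.binary_cross_entropy_with_logits(decoded, change_mask)
    loss.backward()
    optimizer.step()
    return loss.item()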
According to the embodiment of the application, the first image and the second image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to the superposition operation and the decoding operation to obtain decoded data; and difference data between the first image and the second image are determined according to the decoded data. In this way, the two input images can be distinguished more clearly during encoding, and the features of a plurality of coding levels are fused during decoding, so that not only the more abstract high-level features but also the low-level features closer to the image, as well as the features of every intermediate level in between, are utilized. The difference between the two images can therefore be extracted efficiently, which improves both the accuracy and the efficiency of finding differences between images.
Referring to fig. 8, a block diagram of an embodiment of an image processing apparatus according to the sixth embodiment of the present application is shown; the apparatus may specifically include:
the image receiving module 601 is configured to receive a first remote sensing image and a second remote sensing image of the same region at different times;
an extraction module 602, configured to input the first remote sensing image and the second remote sensing image into corresponding coding networks respectively to perform feature extraction, so as to obtain feature data of different coding levels;
a merging module 603, configured to merge feature data of the same coding level to obtain target feature data of different coding levels;
a decoding module 604, configured to perform superposition operation and decoding operation on the target feature data of different coding levels in sequence to obtain decoded data;
and the region determining module 605 is configured to determine a target region having a difference between the first remote sensing image and the second remote sensing image according to the decoded data.
According to the embodiment of the application, a first remote sensing image and a second remote sensing image of the same region at different times are received; the first remote sensing image and the second remote sensing image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels; the feature data of the same coding level are merged to obtain target feature data of different coding levels; the target feature data of the different coding levels are sequentially subjected to the superposition operation and the decoding operation to obtain decoded data; and a target region in which the first remote sensing image and the second remote sensing image differ is determined according to the decoded data. In this way, the two input remote sensing images can be distinguished more clearly during encoding, and the features of a plurality of coding levels are fused during decoding, so that not only the more abstract high-level features but also the low-level features closer to the image, as well as the features of every intermediate level in between, are utilized. The regions that differ between the two remote sensing images can therefore be extracted efficiently, which improves both the accuracy and the efficiency of finding such regions.
Since the apparatus embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, reference may be made to the corresponding parts of the method embodiments.
Embodiments of the disclosure may be implemented as a system using any suitable hardware, firmware, software, or any combination thereof, in a desired configuration. Fig. 9 schematically illustrates an exemplary system (or apparatus) 700 that can be used to implement various embodiments described in this disclosure.
For one embodiment, fig. 9 illustrates an exemplary system 700 having one or more processors 702, a system control module (chipset) 704 coupled to at least one of the processor(s) 702, a system memory 706 coupled to the system control module 704, a non-volatile memory (NVM)/storage 708 coupled to the system control module 704, one or more input/output devices 710 coupled to the system control module 704, and a network interface 712 coupled to the system control module 704.
The processor 702 may include one or more single-core or multi-core processors, and the processor 702 may include any combination of general-purpose or special-purpose processors (e.g., graphics processors, application processors, baseband processors, etc.). In some embodiments, the system 700 can function as a browser as described in embodiments herein.
In some embodiments, system 700 may include one or more computer-readable media (e.g., system memory 706 or NVM/storage 708) having instructions and one or more processors 702 in combination with the one or more computer-readable media configured to execute the instructions to implement modules to perform the actions described in this disclosure.
For one embodiment, system control module 704 may include any suitable interface controllers to provide any suitable interface to at least one of processor(s) 702 and/or any suitable device or component in communication with system control module 704.
The system control module 704 may include a memory controller module to provide an interface to the system memory 706. The memory controller module may be a hardware module, a software module, and/or a firmware module.
System memory 706 may be used to load and store data and/or instructions for system 700, for example. For one embodiment, system memory 706 may include any suitable volatile memory, such as suitable DRAM. In some embodiments, the system memory 706 may include a double data rate type four synchronous dynamic random access memory (DDR4 SDRAM).
For one embodiment, system control module 704 may include one or more input/output controllers to provide an interface to NVM/storage 708 and input/output device(s) 710.
For example, NVM/storage 708 may be used to store data and/or instructions. NVM/storage 708 may include any suitable non-volatile memory (e.g., flash memory) and/or may include any suitable non-volatile storage device(s) (e.g., one or more hard disk drive(s) (HDD (s)), one or more Compact Disc (CD) drive(s), and/or one or more Digital Versatile Disc (DVD) drive (s)).
NVM/storage 708 may include storage resources that are physically part of the device on which system 700 is installed or may be accessed by the device and not necessarily part of the device. For example, NVM/storage 708 may be accessible over a network via input/output device(s) 710.
Input/output device(s) 710 may provide an interface for system 700 to communicate with any other suitable device; input/output device(s) 710 may include communication components, audio components, sensor components, and the like. Network interface 712 may provide an interface for system 700 to communicate over one or more networks; system 700 may communicate wirelessly with one or more components of a wireless network according to any of one or more wireless network standards and/or protocols, for example to access a wireless network based on a communication standard such as WiFi, 2G, or 3G, or a combination thereof.
For one embodiment, at least one of the processor(s) 702 may be packaged together with logic for one or more controller(s) (e.g., memory controller module) of system control module 704. For one embodiment, at least one of the processor(s) 702 may be packaged together with logic for one or more controller(s) of system control module 704 to form a System In Package (SiP). For one embodiment, at least one of the processor(s) 702 may be integrated on the same die with logic for one or more controller(s) of system control module 704. For one embodiment, at least one of the processor(s) 702 may be integrated on the same die with logic for one or more controller(s) of system control module 704 to form a system on a chip (SoC).
In various embodiments, system 700 may be, but is not limited to being: a browser, a workstation, a desktop computing device, or a mobile computing device (e.g., a laptop computing device, a handheld computing device, a tablet, a netbook, etc.). In various embodiments, system 700 may have more or fewer components and/or different architectures. For example, in some embodiments, system 700 includes one or more cameras, a keyboard, a Liquid Crystal Display (LCD) screen (including a touch screen display), a non-volatile memory port, multiple antennas, a graphics chip, an Application Specific Integrated Circuit (ASIC), and speakers.
If the display includes a touch panel, the display screen may be implemented as a touch screen display to receive an input signal from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The present application further provides a non-volatile readable storage medium in which one or more modules (programs) are stored; when the one or more modules are applied to a terminal device, they may cause the terminal device to execute the instructions of the method steps described in the present application.
In one example, a computer device is provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method according to the embodiments of the present application when executing the computer program.
In another example, a computer readable storage medium is provided, having stored thereon a computer program which, when executed by a processor, implements a method as described in one or more embodiments of the application.
Embodiments of the application disclose an image processing method and an image processing apparatus. Example 1 includes an image processing method, comprising:
respectively inputting the first image and the second image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
determining difference data between the first image and the second image according to the decoded data.
Example 2 may include the method of example 1, wherein before the first image and the second image are respectively input to corresponding coding networks for feature extraction, so as to obtain feature data of different coding levels, the method further includes:
receiving two remote sensing images of the same area at different times as the first image and the second image.
Example 3 may include the method of example 1 and/or example 2, wherein the respectively inputting the first image and the second image into corresponding coding networks for feature extraction, and obtaining feature data of different coding levels includes:
extracting the characteristics of the first image and the second image in a corresponding coding network to obtain characteristic data of a 1 st coding level;
and performing feature extraction on the feature data of the (N-1) th coding level in the coding network to obtain the feature data of the Nth coding level.
Example 4 may include the method of one or more of examples 1-3, wherein the sequentially performing the superposition operation and the decoding operation on the target feature data of the different encoding levels to obtain decoded data includes:
decoding the target characteristic data of the Nth coding level to obtain Nth decoding data;
performing superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain N-1 th decoding data;
and iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target characteristic data of each coding level are sequentially completed.
Example 5 may include the method of one or more of examples 1-4, wherein the performing the superposition operation and the decoding operation on the target feature data of the (N-1)th coding level and the Nth decoded data to obtain the (N-1)th decoded data comprises:
performing superposition operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain a superposition result;
and sequentially performing decoding operations on the superposition result through a first convolution layer, a first deconvolution layer and a second convolution layer to obtain the N-1 decoding data, wherein the number of output channels of the first convolution layer is one fourth of the number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target characteristic data of the N-2 coding level.
Example 6 may include the method of one or more of examples 1 to 5, wherein, after the superposition operation and the decoding operation on the target feature data of each coding level are sequentially completed, the sequentially performing the superposition operation and the decoding operation on the target feature data of different coding levels to obtain decoded data further includes:
and sequentially carrying out decoding operations on the 1 st decoded data by a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is half of the number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
Example 7 may include the method of one or more of examples 1 to 6, wherein the merging feature data of the same coding level to obtain target feature data of different coding levels includes:
and merging channels of the characteristic data to obtain the target characteristic data.
Example 8 may include the method of one or more of examples 1-7, wherein the determining, from the decoded data, the difference data between the first image and the second image comprises:
determining difference degree data at different positions between the first image and the second image according to the decoding data;
and determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
Example 9 may include the method of one or more of examples 1 to 8, wherein before the first image and the second image are respectively input to corresponding coding networks for feature extraction, and feature data of different coding levels are obtained, the method further includes:
a neural network for image processing is trained using the first image samples and the second image samples, the neural network including an encoding network and a decoding network.
Example 10 includes an image processing method comprising:
receiving a first remote sensing image and a second remote sensing image which aim at the same area and are different in time;
respectively inputting the first remote sensing image and the second remote sensing image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
and determining a target area with difference between the first remote sensing image and the second remote sensing image according to the decoded data.
Example 11 includes an image processing apparatus comprising a neural network comprising a first encoding network, a second encoding network, a decoding network, and a difference determination module;
the first encoding network is configured to: extracting the characteristics of the first image to obtain characteristic data of different coding levels;
the second encoding network is configured to: extracting the features of the second image to obtain feature data of different coding levels;
the decoding network is configured to: merging the feature data of the same coding level to obtain target feature data of different coding levels; sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
the difference determining module is configured to: determining difference data between the first image and the second image according to the decoded data.
Example 12 may include the apparatus of example 11, wherein the apparatus further comprises:
and the image receiving module is used for receiving two remote sensing images aiming at the same area at different time as the first image and the second image before the first image and the second image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels.
Example 13 may include the apparatus of example 11 and/or example 12, wherein the extraction module includes:
the first extraction submodule is used for performing feature extraction on the first image and the second image in the corresponding coding networks to obtain the feature data of the 1st coding level;
and the second extraction submodule is used for performing feature extraction on the feature data of the (N-1)th coding level in the coding network to obtain the feature data of the Nth coding level.
Example 14 may include the apparatus of one or more of examples 11-13, wherein the decoding module comprises:
the first decoding submodule is used for carrying out decoding operation on the target characteristic data of the Nth coding level to obtain Nth decoding data;
the second decoding submodule is used for carrying out superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain the (N-1) th decoding data;
and the iteration submodule is used for iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target characteristic data of each coding level are sequentially completed.
Example 15 may include the apparatus of one or more of examples 11-14, wherein the second decoding sub-module comprises:
the superposition unit is used for carrying out superposition operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain a superposition result;
and the decoding unit is used for sequentially performing decoding operations on the superposition result through a first convolution layer, a first deconvolution layer and a second convolution layer to obtain the (N-1)th decoded data, wherein the number of output channels of the first convolution layer is one fourth of its number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target feature data of the (N-2)th coding level.
Example 16 may include the apparatus of one or more of examples 11-15, wherein the decoding module further comprises:
and the decoding submodule is used for, after the superposition operation and the decoding operation on the target feature data of each coding level are sequentially completed, sequentially performing decoding operations on the 1st decoded data through a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is one half of its number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
Example 17 may include the apparatus of one or more of examples 11-16, wherein the means for merging comprises:
and the merging submodule is used for merging the channels of the feature data of the same coding level to obtain the target feature data.
Example 18 may include the apparatus of one or more of examples 11-17, wherein the discrepancy determining module comprises:
the degree determining submodule is used for determining difference degree data at different positions between the first image and the second image according to the decoded data;
and the difference determining submodule is used for determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
Example 19 may include the apparatus of one or more of examples 11-18, wherein the apparatus further comprises:
and the training module is used for training a neural network for image processing using a first image sample and a second image sample, before the first image and the second image are respectively input into the corresponding coding networks for feature extraction to obtain feature data of different coding levels; the neural network comprises a coding network and a decoding network.
Example 20 includes an image processing apparatus comprising:
the image receiving module is used for receiving a first remote sensing image and a second remote sensing image which aim at the same region and are different in time;
the extraction module is used for respectively inputting the first remote sensing image and the second remote sensing image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
the merging module is used for merging the feature data of the same coding level to obtain target feature data of different coding levels;
the decoding module is used for sequentially performing superposition operation and decoding operation on the target characteristic data of different coding layers to obtain decoded data;
and the region determining module is used for determining a target region with difference between the first remote sensing image and the second remote sensing image according to the decoded data.
Example 21 includes a computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing a method as in one or more of examples 1-10 when executing the computer program.
Example 22 includes a computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements a method as in one or more of examples 1-10.
Although certain examples have been illustrated and described for purposes of description, a wide variety of alternate and/or equivalent implementations may be substituted to achieve the same objectives without departing from the scope of the present application. This application is intended to cover any adaptations or variations of the embodiments discussed herein. It is therefore intended that the embodiments described herein be limited only by the claims and their equivalents.

Claims (13)

1. An image processing method, comprising:
respectively inputting the first image and the second image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
determining difference data between the first image and the second image according to the decoded data.
2. The method according to claim 1, wherein before the first image and the second image are respectively input to corresponding coding networks for feature extraction, so as to obtain feature data of different coding levels, the method further comprises:
receiving two remote sensing images of the same area at different times as the first image and the second image.
3. The method according to claim 1, wherein the step of inputting the first image and the second image into corresponding coding networks respectively for feature extraction to obtain feature data of different coding levels comprises:
extracting the characteristics of the first image and the second image in a corresponding coding network to obtain characteristic data of a 1 st coding level;
and performing feature extraction on the feature data of the (N-1) th coding level in the coding network to obtain the feature data of the Nth coding level.
4. The method of claim 3, wherein the sequentially performing the superposition operation and the decoding operation on the target feature data of the different encoding levels to obtain decoded data comprises:
decoding the target characteristic data of the Nth coding level to obtain Nth decoding data;
performing superposition operation and decoding operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain N-1 th decoding data;
and iteratively executing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target characteristic data of each coding level are sequentially completed.
5. The method of claim 4, wherein the performing the superposition operation and the decoding operation on the target feature data of the (N-1)th coding level and the Nth decoded data to obtain the (N-1)th decoded data comprises:
performing superposition operation on the target characteristic data of the (N-1) th coding level and the Nth decoding data to obtain a superposition result;
and sequentially performing decoding operations on the superposition result through a first convolution layer, a first deconvolution layer and a second convolution layer to obtain the N-1 decoding data, wherein the number of output channels of the first convolution layer is one fourth of the number of input channels, the number of input channels and the number of output channels of the first deconvolution layer are the same, and the number of output channels of the second convolution layer is the same as the number of channels of the target characteristic data of the N-2 coding level.
6. The method of claim 5, wherein after the iteratively performing the superposition operation and the decoding operation until the superposition operation and the decoding operation of the target feature data of each coding level are sequentially completed, the sequentially performing the superposition operation and the decoding operation on the target feature data of different coding levels to obtain decoded data further comprises:
and sequentially carrying out decoding operations on the 1 st decoded data by a second deconvolution layer, a third convolution layer and a fourth convolution layer to obtain the decoded data, wherein the number of output channels of the second deconvolution layer is half of the number of input channels, the number of input channels and the number of output channels of the third convolution layer are the same, and the number of output channels of the fourth convolution layer is 1.
7. The method of claim 1, wherein the merging the feature data of the same coding level to obtain the target feature data of different coding levels comprises:
and merging channels of the characteristic data to obtain the target characteristic data.
8. The method of claim 1, wherein determining difference data between the first image and the second image from the decoded data comprises:
determining difference degree data at different positions between the first image and the second image according to the decoding data;
and determining difference data between the first image and the second image according to the position information corresponding to the difference degree data meeting the preset requirement.
9. The method according to claim 1, wherein before the first image and the second image are respectively input to corresponding coding networks for feature extraction, so as to obtain feature data of different coding levels, the method further comprises:
a neural network for image processing is trained using the first image samples and the second image samples, the neural network including an encoding network and a decoding network.
10. An image processing method, comprising:
receiving a first remote sensing image and a second remote sensing image which aim at the same area and are different in time;
respectively inputting the first remote sensing image and the second remote sensing image into corresponding coding networks for feature extraction to obtain feature data of different coding levels;
merging the feature data of the same coding level to obtain target feature data of different coding levels;
sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
and determining a target area with difference between the first remote sensing image and the second remote sensing image according to the decoded data.
11. An image processing apparatus comprising a neural network, the neural network comprising a first encoding network, a second encoding network, a decoding network, and a difference determination module;
the first encoding network is configured to: extracting the characteristics of the first image to obtain characteristic data of different coding levels;
the second encoding network is configured to: extracting the features of the second image to obtain feature data of different coding levels;
the decoding network is configured to: merging the feature data of the same coding level to obtain target feature data of different coding levels; sequentially performing superposition operation and decoding operation on the target characteristic data of different coding levels to obtain decoded data;
the discrepancy determining module is to: determining difference data between the first image and the second image according to the decoded data.
12. A computer arrangement comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method according to one or more of claims 1-10 when executing the computer program.
13. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the method according to one or more of claims 1-10.
Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210302