Disclosure of Invention
The invention aims to provide a hyperspectral panchromatic sharpening method based on a deep detail injection network, thereby solving the prior-art problem that insufficient detail extraction during hyperspectral fusion limits the fusion quality.
The technical scheme adopted by the invention is a hyperspectral panchromatic sharpening method based on a deep detail injection network, implemented according to the following steps:
step 1, selecting two hyperspectral image data sets, the two data sets respectively covering an indoor scene and an outdoor scene, wherein the indoor scene is represented by the Cave data set and the outdoor scene is represented by the Pavia Center data set;
step 2, up-sampling the low-resolution hyperspectral image in the data set of step 1, combining the up-sampled hyperspectral image with the panchromatic image, and inputting the joint image into a convolutional layer to extract shallow features of the joint image;
step 3, sending the shallow features extracted in step 2 to a convolutional layer again to extract shallow features a second time; then inputting the shallow features extracted the second time into a residual dense block network; finally, performing one global feature fusion over all the residual dense blocks to obtain the hierarchical features of the joint image;
step 4, performing a residual operation on the shallow features obtained in step 2 and the hierarchical features obtained in step 3; finally, performing one convolution operation to obtain the fusion result of the hyperspectral panchromatic sharpening method based on the deep detail injection network.
The present invention is also characterized in that,
step 1 is specifically as follows:
step 1.1, adopting the Cave data set to represent an indoor scene and the Pavia Center data set to represent an outdoor scene; the original hyperspectral images in the data sets are used as reference images, the simulated low-resolution hyperspectral image is obtained by down-sampling the reference image, and the simulated high-resolution panchromatic image is obtained by averaging the reference image along its third (spectral) dimension;
step 1.2, dividing each of the two data sets, namely the Cave data set and the Pavia Center data set, into a training set, a validation set and a test set, wherein each training set contains 80% of the images in the data set, each test set contains 10%, and each validation set contains 10%;
step 1.3, after dividing the data sets, performing data preprocessing and uniformly adjusting the image size to 64 × 64.
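For illustration, the simulation of step 1.1 can be sketched as follows in NumPy; the scaling factor r = 4, the block-mean down-sampling operator and the function name simulate_inputs are assumptions made only for this sketch, since the text only specifies that the reference image is down-sampled spatially and averaged along the spectral dimension.

```python
import numpy as np

def simulate_inputs(ref_hs: np.ndarray, r: int = 4):
    """Simulate the LR hyperspectral image and HR panchromatic image
    from a reference cube of shape (H, W, n), as described in step 1.1.

    The r x r block-mean down-sampling used here is an illustrative choice;
    the method only requires that the reference image be down-sampled.
    """
    H, W, n = ref_hs.shape
    H, W = (H // r) * r, (W // r) * r              # crop so H and W divide by r
    ref = ref_hs[:H, :W, :]

    # Simulated low-resolution hyperspectral image: spatial down-sampling.
    lr_hs = ref.reshape(H // r, r, W // r, r, n).mean(axis=(1, 3))

    # Simulated high-resolution panchromatic image: average over the
    # third (spectral) dimension of the reference image.
    pan = ref.mean(axis=2)
    return lr_hs, pan

# Example: a 64 x 64 reference patch with 31 bands (Cave-like).
ref = np.random.rand(64, 64, 31).astype(np.float32)
lr_hs, pan = simulate_inputs(ref, r=4)
print(lr_hs.shape, pan.shape)                      # (16, 16, 31) (64, 64)
```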
step 2 is specifically as follows:
step 2.1, up-sampling the low-resolution hyperspectral image obtained in step 1.1, as shown in formula (1):
\widetilde{HS}_b = f_{up}(HS_b), b = 1, 2, ..., n,   (1)
in the formula: HS_b represents the b-th band of the low-resolution hyperspectral image, b = 1, 2, 3, ..., n, where n is the number of bands of the hyperspectral image; f_{up} is the bicubic interpolation function that up-samples the low-spatial-resolution hyperspectral image by the relative scaling factor; \widetilde{HS}_b represents the up-sampled hyperspectral image;
step 2.2, performing a joining (concatenation) operation on the high-resolution panchromatic image obtained in step 1.1 and the up-sampled hyperspectral image obtained in step 2.1 to obtain a joint image with n + 1 bands, the joining process being shown in formula (2):
HS_{in} = [\widetilde{HS}_1, ..., \widetilde{HS}_n, PAN],   (2)
in the formula: b = 1, 2, 3, ..., n, n + 1 indexes the bands of the joint image; PAN represents the panchromatic image; HS_{in} represents the joint image;
step 2.3, in the preprocessing stage, extracting shallow features of the joint image with a 3 × 3 convolutional layer, as shown in formula (3):
F_{-1} = f_{CONV}(HS_{in}),   (3)
in the formula: f_{CONV} represents a convolution operation; F_{-1} represents the shallow features of the joint image, which serve both as the input for hierarchical feature extraction and for global residual learning.
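A minimal PyTorch sketch of step 2 (formulas (1)-(3)) is given below; the band number n = 31, the scaling factor of 4, the 64 feature channels and the tensor shapes are assumptions for illustration only.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

n_bands, ratio, feats = 31, 4, 64           # assumed values for illustration

# Low-resolution HS cube (1, n, h, w) and high-resolution PAN image (1, 1, H, W).
lr_hs = torch.rand(1, n_bands, 16, 16)
pan   = torch.rand(1, 1, 64, 64)

# Formula (1): bicubic up-sampling of the LR hyperspectral image.
hs_up = F.interpolate(lr_hs, scale_factor=ratio, mode="bicubic", align_corners=False)

# Formula (2): band-wise joining of the up-sampled HS image and the PAN image
# into a joint image HS_in with n + 1 bands.
hs_in = torch.cat([hs_up, pan], dim=1)      # (1, n + 1, 64, 64)

# Formula (3): a 3x3 convolutional layer extracts the shallow features F_{-1}.
conv_sfe = nn.Conv2d(n_bands + 1, feats, kernel_size=3, padding=1)
f_minus1 = conv_sfe(hs_in)                  # (1, 64, 64, 64)
print(hs_in.shape, f_minus1.shape)
```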
Step 3, the shallow features of the joint image extracted in step 2 are sent to a convolutional layer again for a second extraction; then the shallow features extracted the second time are input into a residual dense block network; finally, one global feature fusion is performed over all the residual dense blocks, specifically as follows:
step 3.1, performing a 3 × 3 convolution operation on the shallow features extracted in step 2, as shown in formula (4):
F_0 = f_{CONV}(F_{-1}),   (4)
in the formula: F_0 represents the shallow features extracted by the second convolution operation on the joint image;
step 3.2, assuming that the network contains D residual dense blocks RDB and considering the d-th residual dense block RDB, its output is given by formula (5); within one residual dense block RDB, the state of the previous RDB is first passed to every convolutional layer and rectified linear unit of the current RDB, as shown in formula (6):
F_d = H_{RDB,d}(F_{d-1}),   (5)
F_{d,i} = ReLU(W_{d,i}[F_{d-1}, F_{d,1}, ..., F_{d,i-1}]),   (6)
in the formula: H_{RDB,d} represents the composite function of the d-th residual dense block RDB; W_{d,i} represents the weight of the i-th layer in the d-th RDB; F_{d,i} represents the output of the i-th convolutional layer of the d-th RDB, i = 1, 2, 3, ..., I; ReLU represents the activation function; [F_{d-1}, F_{d,1}, ..., F_{d,i-1}] denotes the concatenation of the state of the previous RDB with the outputs of the preceding layers of the current RDB;
step 3.3, secondly, the state of the previous residual dense block RDB and the states of all the convolutional layers in the current RDB are concatenated and adaptively fused together through a 1 × 1 convolution, as shown in formula (7):
F_{d,LF} = H^d_{LFF}([F_{d-1}, F_{d,1}, ..., F_{d,I}]),   (7)
in the formula: H^d_{LFF} represents the 1 × 1 convolutional layer function in the d-th RDB; F_{d,LF} represents the local features of the d-th residual dense block RDB;
step 3.4, finally, the output of the previous residual dense block RDB and the local features of the d-th RDB obtained in step 3.3 are summed, giving formula (8):
F_d = F_{d-1} + F_{d,LF},   (8)
in the formula: F_{d-1} represents the output of the (d-1)-th residual dense block RDB, and F_d represents the output of the d-th residual dense block RDB;
step 3.5, performing global feature fusion on all the residual dense blocks RDB obtained in step 3.4, namely adaptively fusing the features extracted by all the RDBs together to obtain the global features, the process being defined as formula (9):
F_{GFF} = H_{GFF}([F_1, ..., F_D]),   (9)
in the formula: [F_1, ..., F_D] is the concatenation of the outputs of the 1st RDB through the D-th RDB; H_{GFF} is the global feature fusion function; F_{GFF} represents the hierarchical features of the joint image.
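The residual dense block of formulas (5)-(8) and the global feature fusion of formula (9) can be sketched in PyTorch as below; the number of blocks D = 8, the growth rate of 32 and the layer count I = 6 are not fixed by the text and are illustrative assumptions (the 1 × 1 plus 3 × 3 fusion follows step 3.5 of the detailed description).

```python
import torch
import torch.nn as nn

class RDB(nn.Module):
    """Residual dense block: densely connected conv + ReLU layers (formula (6)),
    1x1 local feature fusion (formula (7)) and a local residual connection
    (formula (8))."""
    def __init__(self, feats: int = 64, growth: int = 32, n_layers: int = 6):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = feats
        for _ in range(n_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            ch += growth
        self.lff = nn.Conv2d(ch, feats, kernel_size=1)    # H_LFF: 1x1 convolution

    def forward(self, f_prev: torch.Tensor) -> torch.Tensor:
        states = [f_prev]
        for layer in self.layers:
            # Formula (6): each layer sees the previous RDB state together with
            # the outputs of all earlier layers of the current RDB.
            states.append(layer(torch.cat(states, dim=1)))
        f_lf = self.lff(torch.cat(states, dim=1))         # formula (7)
        return f_prev + f_lf                              # formula (8)

class GFF(nn.Module):
    """Global feature fusion (formula (9)): concatenate the outputs of all D RDBs
    and fuse them with a 1x1 convolution followed by a 3x3 convolution."""
    def __init__(self, feats: int = 64, n_blocks: int = 8):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(n_blocks * feats, feats, kernel_size=1),
            nn.Conv2d(feats, feats, kernel_size=3, padding=1))

    def forward(self, rdb_outputs):
        return self.fuse(torch.cat(rdb_outputs, dim=1))

# Example wiring with D = 8 blocks (D is not fixed by the text).
rdbs = nn.ModuleList([RDB() for _ in range(8)])
gff = GFF(n_blocks=8)
x = torch.rand(1, 64, 64, 64)          # F_0 from formula (4)
outputs = []
for block in rdbs:
    x = block(x)
    outputs.append(x)
f_gff = gff(outputs)                   # hierarchical features F_GFF, formula (9)
print(f_gff.shape)
```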
Step 4, performing a residual operation on the shallow features obtained in step 2.3 and the hierarchical features obtained in step 3; finally, performing one convolution operation to obtain the high-resolution hyperspectral image, specifically as follows:
step 4.1, the summation of the shallow features obtained in step 2.3 and the hierarchical features obtained in step 3 is shown in formula (10):
F_{Res} = F_{-1} + F_{GFF},   (10)
in the formula: F_{-1} and F_{GFF} respectively represent the shallow features and the hierarchical features of the joint image, and F_{Res} represents the dense features of the joint image;
step 4.2, performing a 3 × 3 convolution operation on the dense features of the joint image obtained in step 4.1 to obtain a high-resolution hyperspectral image with n bands, as shown in formula (11):
HS_{fus} = f_{CONV}(F_{Res}),   (11)
in the formula: F_{Res} represents the dense features of the joint image; HS_{fus} represents the high-resolution hyperspectral image.
The hyperspectral panchromatic sharpening method based on the deep detail injection network has the advantages that a residual dense network is adopted to make full use of all the hierarchical features of the input image, residual dense blocks are used to extract local features, and global residual learning finally combines the shallow features and the deep features, so that a hyperspectral image with both high spatial resolution and high spectral resolution is obtained.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and specific embodiments.
The invention, a hyperspectral panchromatic sharpening method based on a deep detail injection network, is implemented in detail according to the following steps, with reference to the flow chart of figure 1 and to figures 2-3:
step 1, selecting two hyperspectral image data sets, the two data sets respectively covering an indoor scene and an outdoor scene, wherein the indoor scene is represented by the Cave data set and the outdoor scene is represented by the Pavia Center data set;
step 1 is specifically as follows:
step 1.1, constructing the hyperspectral image data sets, adopting the Cave data set to represent an indoor scene and the Pavia Center data set to represent an outdoor scene; according to the Wald protocol, the original hyperspectral images in the data sets are used as reference images, the simulated low-resolution hyperspectral image is obtained by down-sampling the reference image, and the simulated high-resolution panchromatic image is obtained by averaging the reference image along its third (spectral) dimension;
step 1.2, dividing each of the two data sets, namely the Cave data set and the Pavia Center data set, into a training set, a validation set and a test set, wherein each training set contains 80% of the images in the data set, each test set contains 10%, and each validation set contains 10%;
step 1.3, after dividing the data sets, performing data preprocessing; to guarantee the feasibility of running the code, the image size is uniformly adjusted to 64 × 64.
Step 2, up-sampling the low-resolution hyperspectral image in the data set of step 1, combining the up-sampled hyperspectral image with the panchromatic image, inputting the joint image into a convolutional layer, and extracting the shallow features F_{-1} of the joint image;
step 2, as shown in fig. 1, the low-resolution hyperspectral image in the data set of step 1 is first up-sampled, then combined with the panchromatic image and input into a convolutional layer, specifically as follows:
step 2.1, up-sampling the low-resolution hyperspectral image obtained in step 1.1, as shown in formula (1):
\widetilde{HS}_b = f_{up}(HS_b), b = 1, 2, ..., n,   (1)
in the formula: HS_b represents the b-th band of the low-resolution hyperspectral image, b = 1, 2, 3, ..., n, where n is the number of bands of the hyperspectral image; f_{up} is the bicubic interpolation function that up-samples the low-spatial-resolution hyperspectral image by the relative scaling factor; \widetilde{HS}_b represents the up-sampled hyperspectral image;
step 2.2, performing a joining (concatenation) operation on the high-resolution panchromatic image obtained in step 1.1 and the up-sampled hyperspectral image obtained in step 2.1 to obtain a joint image with n + 1 bands, the joining process being shown in formula (2):
HS_{in} = [\widetilde{HS}_1, ..., \widetilde{HS}_n, PAN],   (2)
in the formula: b = 1, 2, 3, ..., n, n + 1 indexes the bands of the joint image; PAN represents the panchromatic image; HS_{in} represents the joint image;
step 2.3, in the preprocessing stage, extracting shallow features of the joint image with a 3 × 3 convolutional layer, as shown in formula (3):
F_{-1} = f_{CONV}(HS_{in}),   (3)
in the formula: f_{CONV} represents a convolution operation; F_{-1} represents the shallow features of the joint image, which serve both as the input for hierarchical feature extraction and for global residual learning.
Step 3, the shallow features of the joint image extracted in step 2 are sent to a convolutional layer again for a second extraction of shallow features; then the shallow features extracted the second time are input into a residual dense block network; finally, one global feature fusion is performed over all the residual dense blocks, as shown in fig. 1, specifically as follows:
step 3.1, performing a 3 × 3 convolution operation on the shallow features extracted in step 2, as shown in formula (4):
F_0 = f_{CONV}(F_{-1}),   (4)
in the formula: F_0 represents the shallow features extracted by the second convolution operation on the joint image;
step 3.2, assuming that the network contains D residual dense blocks RDB and considering the d-th residual dense block RDB, its output is given by formula (5); within one residual dense block RDB, the state of the previous RDB is first passed to every convolutional layer and rectified linear unit of the current RDB, as shown in formula (6):
F_d = H_{RDB,d}(F_{d-1}),   (5)
F_{d,i} = ReLU(W_{d,i}[F_{d-1}, F_{d,1}, ..., F_{d,i-1}]),   (6)
in the formula: H_{RDB,d} represents the composite function of the d-th residual dense block RDB; W_{d,i} represents the weight of the i-th layer in the d-th RDB; F_{d,i} represents the output of the i-th convolutional layer of the d-th RDB, i = 1, 2, 3, ..., I; ReLU represents the activation function; [F_{d-1}, F_{d,1}, ..., F_{d,i-1}] denotes the concatenation of the state of the previous RDB with the outputs of the preceding layers of the current RDB;
step 3.3, secondly, the state of the previous residual dense block RDB and the states of all the convolutional layers in the current RDB are concatenated and adaptively fused together through a 1 × 1 convolution, as shown in formula (7):
F_{d,LF} = H^d_{LFF}([F_{d-1}, F_{d,1}, ..., F_{d,I}]),   (7)
in the formula: H^d_{LFF} represents the 1 × 1 convolutional layer function in the d-th RDB; F_{d,LF} represents the local features of the d-th residual dense block RDB;
step 3.4, finally, the output of the previous residual dense block RDB and the local features of the d-th RDB obtained in step 3.3 are summed, giving formula (8):
F_d = F_{d-1} + F_{d,LF},   (8)
in the formula: F_{d-1} represents the output of the (d-1)-th residual dense block RDB, and F_d represents the output of the d-th residual dense block RDB;
step 3.5, performing global feature fusion on all the residual dense blocks RDB obtained in step 3.4, namely adaptively fusing the features extracted by all the RDBs together to obtain the global features; the global feature fusion comprises a 1 × 1 and a 3 × 3 convolution operation, where the 1 × 1 convolution fuses the concatenated series of features and the 3 × 3 convolution further extracts features in preparation for the subsequent global residual learning. The above process is defined as formula (9):
F_{GFF} = H_{GFF}([F_1, ..., F_D]),   (9)
in the formula: [F_1, ..., F_D] is the concatenation of the outputs of the 1st RDB through the D-th RDB; H_{GFF} is the global feature fusion function; F_{GFF} represents the hierarchical features of the joint image.
Step 4, performing a residual operation on the shallow features obtained in step 2 and the hierarchical features obtained in step 3; finally, performing one convolution operation to obtain the fusion result of the hyperspectral panchromatic sharpening method based on the deep detail injection network.
Step 4, as shown in fig. 1, performs a residual operation on the shallow features obtained in step 2.3 and the hierarchical features obtained in step 3; finally, one convolution operation is performed to obtain the high-resolution hyperspectral image, specifically as follows:
step 4.1, the summation of the shallow features obtained in step 2.3 and the hierarchical features obtained in step 3 is shown in formula (10):
F_{Res} = F_{-1} + F_{GFF},   (10)
in the formula: F_{-1} and F_{GFF} respectively represent the shallow features and the hierarchical features of the joint image, and F_{Res} represents the dense features of the joint image;
step 4.2, performing a 3 × 3 convolution operation on the dense features of the joint image obtained in step 4.1 to obtain a high-resolution hyperspectral image with n bands, as shown in formula (11):
HS_{fus} = f_{CONV}(F_{Res}),   (11)
in the formula: F_{Res} represents the dense features of the joint image; HS_{fus} represents the high-resolution hyperspectral image.
In addition, in order to comprehensively evaluate the fused images, both subjective evaluation and objective evaluation are adopted. The objective evaluation indices include Cross-Correlation (CC), Spectral Angle Mapper (SAM), Root-Mean-Squared Error (RMSE), the relative global dimensionless synthesis error (Erreur Relative Globale Adimensionnelle de Synthèse, ERGAS), Peak Signal-to-Noise Ratio (PSNR), and Structural Similarity (SSIM).
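For reference, minimal NumPy sketches of three of these indices (RMSE, PSNR and SAM) are given below; the exact definitions used in the experiments are not reproduced in the text, so these follow the common formulations, and the peak value of 1.0 assumed for PSNR applies to data scaled to [0, 1].

```python
import numpy as np

def rmse(ref: np.ndarray, fus: np.ndarray) -> float:
    """Root-mean-squared error between reference and fused cubes."""
    return float(np.sqrt(np.mean((ref - fus) ** 2)))

def psnr(ref: np.ndarray, fus: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB, assuming data scaled to [0, peak]."""
    return float(20 * np.log10(peak / rmse(ref, fus)))

def sam(ref: np.ndarray, fus: np.ndarray) -> float:
    """Mean spectral angle (degrees) between per-pixel spectra of two
    (H, W, n) cubes."""
    r = ref.reshape(-1, ref.shape[-1])
    f = fus.reshape(-1, fus.shape[-1])
    num = np.sum(r * f, axis=1)
    den = np.linalg.norm(r, axis=1) * np.linalg.norm(f, axis=1) + 1e-12
    angles = np.arccos(np.clip(num / den, -1.0, 1.0))
    return float(np.degrees(angles.mean()))

ref = np.random.rand(64, 64, 31)
fus = ref + 0.01 * np.random.randn(64, 64, 31)
print(rmse(ref, fus), psnr(ref, fus), sam(ref, fus))
```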
The method of the invention is compared herein with the following six algorithms: bicubic interpolation (Bicubic), the adaptive Gram-Schmidt method (GSA), the generalized Laplacian pyramid method based on the modulation transfer function (MTF-GLP), the guided-filter principal component analysis method (GFPCA), coupled non-negative matrix factorization (CNMF), and the panchromatic sharpening method based on a convolutional neural network (PNN).
For subjective evaluation, the Flowers image in the Cave test set is selected, the sampling factor is 8, and the 30th band of the fused image is used for display. Fig. 2(a) is the reference image of Flowers; fig. 2(b) is the result of bicubic interpolation (Bicubic) on Flowers; fig. 2(c) is the result of the adaptive Gram-Schmidt method (GSA) on Flowers; fig. 2(d) is the result of the generalized Laplacian pyramid method based on the modulation transfer function (MTF-GLP) on Flowers; fig. 2(e) is the result of the guided-filter principal component analysis method (GFPCA) on Flowers; fig. 2(f) is the result of coupled non-negative matrix factorization (CNMF) on Flowers; fig. 2(g) is the result of the panchromatic sharpening method based on a convolutional neural network (PNN) on Flowers; and fig. 2(h) is the result of the hyperspectral panchromatic sharpening method based on the deep detail injection network on Flowers. As can be seen from fig. 2, the Bicubic result shows a large amount of blur; the GSA-fused hyperspectral image exhibits some spectral distortion; the MTF-GLP-fused hyperspectral image loses some spatial detail; the GFPCA-fused hyperspectral image is blurred in some detail regions; the CNMF- and PNN-fused hyperspectral images preserve the spectral and spatial information fairly well; and the image fused by the proposed algorithm effectively preserves the spectral information while enhancing the spatial information. The average values over the Cave data set for the various fusion algorithms with sampling factors of 2, 4 and 8 are listed in table 1, with the optimal values shown in bold. From the experimental results it can be seen that, regardless of the sampling factor, the proposed algorithm obtains the largest values of CC, PSNR and SSIM and the smallest values of SAM, RMSE and ERGAS.
Table 1 Mean values of the various fusion algorithms on the Cave data set with different sampling factors
Fig. 3 shows the visualization results of the 102nd band of the fused images obtained on the Pavia Center test set by the different fusion algorithms with a sampling factor of 2. Fig. 3(a) is the reference image of the test set; fig. 3(b) is the result of bicubic interpolation (Bicubic) on the test set; fig. 3(c) is the result of the adaptive Gram-Schmidt method (GSA) on the test set; fig. 3(d) is the result of the generalized Laplacian pyramid method based on the modulation transfer function (MTF-GLP) on the test set; fig. 3(e) is the result of the guided-filter principal component analysis method (GFPCA) on the test set; fig. 3(f) is the result of coupled non-negative matrix factorization (CNMF) on the test set; fig. 3(g) is the result of the panchromatic sharpening method based on a convolutional neural network (PNN) on the test set; and fig. 3(h) is the result of the hyperspectral panchromatic sharpening method based on the deep detail injection network on the test set. The Bicubic result appears blurred in the detail regions; the GSA, CNMF and PNN fusion results lose some detail; the MTF-GLP fusion result exhibits slight spectral distortion and loses part of the edge information; and the GFPCA fusion result shows unclear detail information, with the image dark overall. Compared with the above algorithms, the proposed algorithm is enhanced in both the spectral and the spatial aspects. Table 2 gives the objective evaluation indices of the various fusion algorithms on the Pavia Center data set with different sampling factors. As can be seen from the table, every evaluation index of the proposed algorithm is better than the optimal value among the traditional methods and than the value of PNN.
Table 2 Mean values of the various fusion algorithms on the Pavia Center data set with different sampling factors