
CN111861886B - An image super-resolution reconstruction method based on multi-scale feedback network - Google Patents

An image super-resolution reconstruction method based on multi-scale feedback network

Info

Publication number
CN111861886B
Authority
CN
China
Prior art keywords
image
resolution
feature
scale
conv
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010682515.XA
Other languages
Chinese (zh)
Other versions
CN111861886A (en)
Inventor
陈晓 (Chen Xiao)
孙超文 (Sun Chaowen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University of Information Science and Technology filed Critical Nanjing University of Information Science and Technology
Priority to CN202010682515.XA priority Critical patent/CN111861886B/en
Publication of CN111861886A publication Critical patent/CN111861886A/en
Application granted granted Critical
Publication of CN111861886B publication Critical patent/CN111861886B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4053 Scaling of whole images or parts thereof, e.g. expanding or contracting based on super-resolution, i.e. the output image resolution being higher than the sensor resolution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 3/00 Geometric image transformations in the plane of the image
    • G06T 3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T 3/4007 Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to an image super-resolution reconstruction method based on a multi-scale feedback network, comprising the following steps: (1) create an image data set; (2) extract features of the input image, recursively perform the low-resolution and high-resolution feature mappings with multi-scale up-projection and down-projection units to obtain high-resolution feature maps of different depths, apply a convolution to the high-resolution feature maps to obtain a residual image, and finally interpolate the low-resolution image and add the residual image to obtain the output image; (3) train the multi-scale feedback network with the data set to generate a trained network model; (4) input the low-resolution image to be processed into the trained network to obtain the output high-resolution image. The method can train networks of different depths and extend to other magnification factors with only small parameter adjustments, saving training cost; it can achieve larger magnification factors and improves the peak signal-to-noise ratio and structural similarity of reconstructed images.

Description

An image super-resolution reconstruction method based on a multi-scale feedback network

Technical Field

The invention relates to an image super-resolution reconstruction method based on a multi-scale feedback network, and belongs to the fields of computer vision and deep learning.

Background

Image super-resolution (SR) reconstruction is an important image processing technique in computer vision, widely used in medical imaging, security surveillance, remote sensing image quality improvement, image compression, and object detection. Image super-resolution reconstruction aims to build a suitable model that converts a low-resolution (LR) image into the corresponding high-resolution (HR) image. Since a given LR input corresponds to multiple possible HR images, SR reconstruction is a challenging ill-posed inverse problem.

The SR reconstruction methods proposed so far fall into three main categories: interpolation-based, reconstruction-based, and learning-based methods. Among them, deep-learning-based SR methods have attracted wide attention in recent years for their superior reconstruction performance. SRCNN, the pioneering application of deep learning to SR, fully demonstrated the advantages of convolutional neural networks, and many subsequent works proposed a series of CNN-based SR methods built on the SRCNN architecture. Depth is an important factor, giving the network a larger receptive field and more contextual information, but increasing depth easily causes two problems: vanishing/exploding gradients and a large number of network parameters.

To address the gradient problem, researchers proposed residual learning and successfully trained deeper networks; some networks also introduced dense connections to alleviate vanishing gradients and encourage feature reuse. To reduce parameters, recursive learning was proposed to enable weight sharing. Thanks to these mechanisms, many networks tend to construct deeper and more complex structures in pursuit of higher evaluation metrics. However, research shows that many current networks suffer from the following problems:

First, although many SR methods achieve high performance with deep networks, they ignore the difficulty of training, requiring huge training sets and considerable training tricks and time.

Second, most SR methods directly learn hierarchical feature representations from the LR input and map them to the output space in a feed-forward manner; this one-way mapping relies on the limited features of the LR image. Moreover, many feed-forward networks that require preprocessing support only a single magnification factor, and migrating to other factors requires cumbersome changes, making them extremely inflexible.

Summary of the Invention

To solve the problems in the prior art, the present invention provides an image super-resolution reconstruction method based on a multi-scale feedback network, characterized by comprising the following steps:

Step 1: build a data set using an image degradation model.

Step 2: construct a multi-scale feedback network comprising an image feature extraction module, an image feature mapping module, and a high-resolution image calculation module.

Step 2.1: image feature extraction.

The LR image I_LR input to the network is fed into the feature extraction module f_0 to produce the initial LR feature map L_0:

L_0 = f_0(I_LR)

Let conv(f, n) denote a convolutional layer with kernel size f and n channels. In the formula above, f_0 consists of two convolutional layers, conv(3, n_0) and conv(1, n), where n_0 is the number of channels of the initial low-resolution feature extraction layer and n is the number of input channels of the feature mapping module. conv(3, n_0) first produces shallow features L_0 carrying low-resolution image information from the input, and conv(1, n) then reduces the number of channels from n_0 to n.

Step 2.2: image feature mapping.

The low-resolution feature map L_{g-1} is fed into the recursive feedback module to produce the high-resolution feature map H_g:

H_g = f_MSPG(L_{g-1}), g = 1, 2, …, G

where G is the number of multi-scale projection groups, i.e. the number of recursions, and f_MSPG denotes the feature mapping process of the multi-scale projection group in the g-th recursion. When g = 1, the initial feature map L_0 is the input of the first multi-scale projection group; when g > 1, the LR feature map L_{g-1} produced by the previous multi-scale projection group is the current input.

Step 2.3: compute the high-resolution image.

The HR feature maps are concatenated along the depth dimension and the residual image is computed as:

I_Res = f_RM([H_1, H_2, …, H_G])

where [H_1, H_2, …, H_G] denotes the depth concatenation of the HR feature maps, f_RM denotes a conv(3, 3) operation, and I_Res is the residual image.

The image obtained by interpolating the LR image is added to the residual image I_Res to obtain the reconstructed high-resolution image I_SR:

I_SR = I_Res + f_US(I_LR)

where f_US denotes an interpolation operation.

Step 3: train the multi-scale feedback network.

Step 4: image reconstruction.

In a further refinement of the above technical solution, the data set in Step 1 is built with an image degradation model as follows. Let I_LR denote the LR image and I_HR the corresponding HR image; the degradation process is expressed as:

I_LR = D(I_HR; δ)

The degradation mapping that generates the LR image from the HR image is modeled as a single downsampling operation:

I_LR = (I_HR) ↓_s

where ↓_s denotes downsampling by the magnification factor s, and δ is the scale factor.

The interpolation algorithm is a bilinear interpolation algorithm or a bicubic interpolation algorithm.

The loss function for training the multi-scale feedback network in Step 3 is:

L(x) = (1/m) Σ_{i=1}^{m} ‖ I_SR^(i) − I_HR^(i) ‖_1

where x is the set of weight and bias parameters, i is the index of iterative training over the whole training process, and m is the number of training images.

The beneficial effects of the invention are as follows:

The modular end-to-end architecture of the invention not only trains networks of different depths flexibly and extends to other magnification factors with only small parameter adjustments, greatly saving training cost, but also successfully achieves larger magnification (8×), improving the peak signal-to-noise ratio and structural similarity of reconstructed images. The method also alleviates the ringing effect and checkerboard artifacts of convolutional-neural-network-based methods, predicting more high-frequency details and suppressing smooth components, so that the reconstructed image has clearer, sharper edge features and is closer to the true high-resolution image.

Brief Description of the Drawings

Fig. 1 is a flowchart of the method of the invention;

Fig. 2 is a structural diagram of the multi-scale feedback network;

Fig. 3 is a structural diagram of the multi-scale up-projection unit in the network;

Fig. 4 is a structural diagram of the multi-scale down-projection unit in the network.

Detailed Description

The invention is described in detail below with reference to the accompanying drawings and a specific embodiment.

Embodiment

As shown in Fig. 1, the image super-resolution reconstruction method based on a multi-scale feedback network of this embodiment comprises the following steps:

Step 1: build a data set using the image degradation model.

Let I_LR denote the LR image and I_HR the corresponding HR image; the degradation process is expressed as:

I_LR = D(I_HR; δ)  (1)

The degradation mapping that generates the LR image from the HR image is modeled as a single downsampling operation:

I_LR = (I_HR) ↓_s  (2)

where ↓_s denotes downsampling by the magnification factor s, and δ is the scale factor.

This embodiment uses bicubic interpolation with anti-aliasing as the downsampling operation and takes m training images from DIV2K as the training set. Set5, Set14, Urban100, BSD100, and Manga109 are chosen as standard test sets, downsampled by factors of 2, 3, 4, and 8 with the bicubic interpolation algorithm.
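The following sketch illustrates this degradation pipeline in Python. The directory layout and file handling are illustrative assumptions, not specified by the patent; Pillow's BICUBIC resize applies anti-aliasing when shrinking, matching the stated setup.

```python
from pathlib import Path
from PIL import Image

def make_lr_hr_pair(hr_path, scale):
    """Build one (LR, HR) training pair via Eq. (2): I_LR = (I_HR) downsampled by s."""
    hr = Image.open(hr_path).convert("RGB")
    # Crop so the HR size is an exact multiple of the scale factor s.
    w, h = (hr.width // scale) * scale, (hr.height // scale) * scale
    hr = hr.crop((0, 0, w, h))
    lr = hr.resize((w // scale, h // scale), Image.BICUBIC)  # anti-aliased bicubic
    return lr, hr

# Assumed location of the DIV2K training images.
pairs = [make_lr_hr_pair(p, scale=4)
         for p in sorted(Path("DIV2K/train").glob("*.png"))]
```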

Step 2: construct the multi-scale feedback network. The network structure is shown in Fig. 2, and the construction comprises the following steps.

Step 2.1: image feature extraction.

The initial LR image I_LR is fed into the feature extraction module f_0 to produce the initial LR feature map L_0:

L_0 = f_0(I_LR)  (3)

Let conv(f, n) denote a convolutional layer with kernel size f and n channels. Here f_0 consists of two convolutional layers, conv(3, n_0) and conv(1, n), where n_0 is the number of channels of the initial LR feature extraction layer and n is the number of input channels of the feature mapping module. conv(3, n_0) first produces shallow features L_0 carrying LR image information from the input, and conv(1, n) then reduces the number of channels from n_0 to n.
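A minimal PyTorch sketch of f_0 follows, assuming n_0 = 256 and n = 64 (the patent does not fix these values here); PReLU activations follow the convolutions, per Step 3 of this embodiment.

```python
import torch
import torch.nn as nn

class FeatureExtraction(nn.Module):
    """f0: conv(3, n0) for shallow LR features, then conv(1, n) to reduce channels."""
    def __init__(self, in_channels=3, n0=256, n=64):
        super().__init__()
        self.conv3 = nn.Sequential(nn.Conv2d(in_channels, n0, 3, padding=1), nn.PReLU())
        self.conv1 = nn.Sequential(nn.Conv2d(n0, n, 1), nn.PReLU())

    def forward(self, i_lr):
        l0 = self.conv3(i_lr)   # shallow features with LR image information
        return self.conv1(l0)   # channels reduced from n0 to n
```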

Step 2.2: image feature mapping.

A multi-scale up-projection unit and a multi-scale down-projection unit form a projection group that recursively performs the low-resolution and high-resolution feature mappings, yielding high-resolution feature maps of different depths. The low-resolution feature map L_{g-1} is fed into the recursive feedback module to produce the high-resolution feature map H_g:

H_g = f_MSPG(L_{g-1}), g = 1, 2, …, G  (4)

where G is the number of multi-scale projection groups, i.e. the number of recursions, and f_MSPG denotes the feature mapping process of the multi-scale projection group in the g-th recursion. When g = 1, the initial feature map L_0 is the input of the first multi-scale projection group; when g > 1, the LR feature map L_{g-1} produced by the previous multi-scale projection group is the current input.

The f_MSPG operation comprises two mappings, LR features to HR features and HR features to LR features, whose structures are shown in Fig. 3 and Fig. 4.

The multi-scale up-projection unit maps LR features to HR features in the following six steps (structure shown in Fig. 3):

(1) The LR feature map L_{g-1} computed in the previous cycle is taken as input, and up-sampling is performed on two branches with deconvolutions of different kernel sizes, producing two HR feature maps H_g^1 and H_g^2:

H_g^1 = Deconv1(L_{g-1})  (5)

H_g^2 = Deconv2(L_{g-1})  (6)

Here Deconv1 and Deconv2 denote the deconvolution layers Deconv1(k_1, n) and Deconv2(k_2, n), where k_1 and k_2 are the deconvolution kernel sizes and n is the number of channels.

(2) The HR feature maps H_g^1 and H_g^2 are concatenated, and down-sampling is performed on two branches with convolutions of different kernel sizes, generating two LR feature maps L_g^1 and L_g^2:

L_g^1 = Conv1([H_g^1, H_g^2])  (7)

L_g^2 = Conv2([H_g^1, H_g^2])  (8)

Here Conv1 and Conv2 denote Conv1(k_1, 2n) and Conv2(k_2, 2n); the number of channels of each branch changes from n to 2n.

(3) The LR feature maps L_g^1 and L_g^2 are concatenated and a 1×1 convolution performs pooling and dimensionality reduction, mapping L_g^1 and L_g^2 to one LR feature map L̂_g:

L̂_g = C_u([L_g^1, L_g^2])  (9)

C_u denotes Conv(1, n); the number of channels changes from 2n to n. All 1×1 convolutions add a nonlinear activation to the representation learned by the previous layer.

(4) The residual e_g^l between the input LR feature map L_{g-1} and the reconstructed LR feature map L̂_g is computed:

e_g^l = L_{g-1} − L̂_g  (10)

(5) The residual e_g^l is up-sampled on two branches with deconvolutions of different kernel sizes; the residual of the LR features is thereby mapped into the HR features, generating new HR residual features E_g^1 and E_g^2:

E_g^1 = Deconv1(e_g^l)  (11)

E_g^2 = Deconv2(e_g^l)  (12)

Here the two branches are the deconvolution layers Deconv1(k_1, n) and Deconv2(k_2, n); the number of channels of each branch remains n.

(6) The residual HR features E_g^1 and E_g^2 are concatenated, added to the HR features concatenated in step (2), and passed through a 1×1 convolution to output the final HR feature map H_g of the up-projection unit:

H_g = C_h([E_g^1, E_g^2] + [H_g^1, H_g^2])  (13)

C_h denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, consistent with the number of input channels.
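A PyTorch sketch of this up-projection unit, with the six steps marked, reusing the imports of the sketch above. The concrete kernel sizes k_1 = 6 and k_2 = 8 with stride 4 and paddings 1 and 2 are assumed values for ×4 magnification; the patent only requires that kernel size and stride be chosen per magnification factor so that both branches resample by exactly s.

```python
def deconv(cin, cout, k, s, p):
    return nn.Sequential(nn.ConvTranspose2d(cin, cout, k, s, p), nn.PReLU())

def conv(cin, cout, k, s, p):
    return nn.Sequential(nn.Conv2d(cin, cout, k, s, p), nn.PReLU())

class MultiScaleUpProjection(nn.Module):
    def __init__(self, n=64, k1=6, k2=8, s=4, p1=1, p2=2):
        super().__init__()
        self.up1, self.up2 = deconv(n, n, k1, s, p1), deconv(n, n, k2, s, p2)      # step (1)
        self.dn1, self.dn2 = conv(2*n, 2*n, k1, s, p1), conv(2*n, 2*n, k2, s, p2)  # step (2)
        self.cu = conv(4*n, n, 1, 1, 0)                                            # step (3): C_u
        self.ru1, self.ru2 = deconv(n, n, k1, s, p1), deconv(n, n, k2, s, p2)      # step (5)
        self.ch = nn.Conv2d(2*n, n, 1)                                             # step (6): C_h

    def forward(self, l_prev):
        h1, h2 = self.up1(l_prev), self.up2(l_prev)         # step (1): Eqs. (5)-(6)
        h_cat = torch.cat([h1, h2], dim=1)                  # 2n channels
        l1, l2 = self.dn1(h_cat), self.dn2(h_cat)           # step (2): Eqs. (7)-(8)
        l_hat = self.cu(torch.cat([l1, l2], dim=1))         # step (3): Eq. (9)
        e_l = l_prev - l_hat                                # step (4): Eq. (10)
        e1, e2 = self.ru1(e_l), self.ru2(e_l)               # step (5): Eqs. (11)-(12)
        return self.ch(torch.cat([e1, e2], dim=1) + h_cat)  # step (6): Eq. (13)
```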

The multi-scale down-projection unit maps HR features to LR features in the following six steps (structure shown in Fig. 4):

(1) The HR feature map H_g output by the multi-scale up-projection unit in the preceding cycle is taken as input, and down-sampling is performed on two branches with convolutions of different kernel sizes, producing two LR feature maps L_g^1 and L_g^2:

L_g^1 = Conv1(H_g)  (14)

L_g^2 = Conv2(H_g)  (15)

Here Conv1 and Conv2 denote Conv1(k_1, n) and Conv2(k_2, n).

(2) The LR feature maps L_g^1 and L_g^2 are concatenated, and up-sampling is performed on two branches with deconvolutions of different kernel sizes, generating two HR feature maps H_g^1 and H_g^2:

H_g^1 = Deconv1([L_g^1, L_g^2])  (16)

H_g^2 = Deconv2([L_g^1, L_g^2])  (17)

Here Deconv1 and Deconv2 denote Deconv1(k_1, 2n) and Deconv2(k_2, 2n); the number of channels of each branch changes from n to 2n.

(3) The HR feature maps H_g^1 and H_g^2 are concatenated and a 1×1 convolution produces the HR feature map Ĥ_g:

Ĥ_g = C_d([H_g^1, H_g^2])  (18)

C_d denotes Conv(1, n); the number of channels changes from 2n to n.

(4) The residual e_g^h between the input HR feature map H_g and the reconstructed HR feature map Ĥ_g is computed:

e_g^h = H_g − Ĥ_g  (19)

(5) The residual e_g^h is down-sampled on two branches with convolutions of different kernel sizes; the residual of the HR features is thereby mapped into the LR features, generating new LR residual features E_g^1 and E_g^2:

E_g^1 = Conv1(e_g^h)  (20)

E_g^2 = Conv2(e_g^h)  (21)

Here the two branches are the convolutional layers Conv1(k_1, n) and Conv2(k_2, n); the number of channels of each branch remains n.

(6) The residual LR features E_g^1 and E_g^2 are concatenated, added to the LR features concatenated in step (2), and passed through a 1×1 convolution to output the final LR feature map L_g of the down-projection unit:

L_g = C_l([E_g^1, E_g^2] + [L_g^1, L_g^2])  (22)

C_l denotes Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, consistent with the number of input channels.
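The mirror-image down-projection unit, sketched under the same assumed hyper-parameters (k_1 = 6, k_2 = 8, stride 4) and reusing the conv/deconv helpers above:

```python
class MultiScaleDownProjection(nn.Module):
    def __init__(self, n=64, k1=6, k2=8, s=4, p1=1, p2=2):
        super().__init__()
        self.dn1, self.dn2 = conv(n, n, k1, s, p1), conv(n, n, k2, s, p2)              # step (1)
        self.up1, self.up2 = deconv(2*n, 2*n, k1, s, p1), deconv(2*n, 2*n, k2, s, p2)  # step (2)
        self.cd = conv(4*n, n, 1, 1, 0)                                                # step (3): C_d
        self.rd1, self.rd2 = conv(n, n, k1, s, p1), conv(n, n, k2, s, p2)              # step (5)
        self.cl = nn.Conv2d(2*n, n, 1)                                                 # step (6): C_l

    def forward(self, h_g):
        l1, l2 = self.dn1(h_g), self.dn2(h_g)               # step (1): Eqs. (14)-(15)
        l_cat = torch.cat([l1, l2], dim=1)                  # 2n channels
        h1, h2 = self.up1(l_cat), self.up2(l_cat)           # step (2): Eqs. (16)-(17)
        h_hat = self.cd(torch.cat([h1, h2], dim=1))         # step (3): Eq. (18)
        e_h = h_g - h_hat                                   # step (4): Eq. (19)
        e1, e2 = self.rd1(e_h), self.rd2(e_h)               # step (5): Eqs. (20)-(21)
        return self.cl(torch.cat([e1, e2], dim=1) + l_cat)  # step (6): Eq. (22)
```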

Step 2.3: compute the high-resolution image.

The high-resolution feature maps are concatenated along the depth dimension and the residual image is computed as:

I_Res = f_RM([H_1, H_2, …, H_G])  (23)

where [H_1, H_2, …, H_G] denotes the depth concatenation of the HR feature maps; f_RM denotes a conv(3, 3) operation that takes the concatenated series of HR feature maps as input and generates the residual image I_Res.

The image obtained by interpolating the low-resolution image is added to the residual image I_Res to generate the reconstructed high-resolution image I_SR:

I_SR = I_Res + f_US(I_LR)  (24)

where f_US denotes the interpolation up-sampling operation; a bilinear interpolation algorithm is used here, though a bicubic or other interpolation algorithm may also be used.
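Putting the modules together, a sketch of the full forward pass: G recursions through one shared projection group (the recursion implements the weight sharing), depth concatenation, the conv(3, 3) reconstruction of Eq. (23), and the global residual connection of Eq. (24). G = 6 and the ×4 kernel settings assumed above are illustrative; the patent trains different depths and magnifications by varying them.

```python
import torch.nn.functional as F

class MultiScaleFeedbackNetwork(nn.Module):
    def __init__(self, scale=4, n0=256, n=64, G=6):
        super().__init__()
        self.extract = FeatureExtraction(3, n0, n)
        self.up_proj = MultiScaleUpProjection(n=n, s=scale)
        self.down_proj = MultiScaleDownProjection(n=n, s=scale)
        self.f_rm = nn.Conv2d(G * n, 3, 3, padding=1)   # f_RM = conv(3, 3), reconstruction layer
        self.G, self.scale = G, scale

    def forward(self, i_lr):
        l_g = self.extract(i_lr)                # L0, Eq. (3)
        hr_maps = []
        for _ in range(self.G):                 # Eq. (4): recursive feedback
            h_g = self.up_proj(l_g)             # LR -> HR features
            l_g = self.down_proj(h_g)           # HR -> LR features, fed back
            hr_maps.append(h_g)
        i_res = self.f_rm(torch.cat(hr_maps, dim=1))                # Eq. (23)
        i_up = F.interpolate(i_lr, scale_factor=self.scale,
                             mode="bilinear", align_corners=False)  # f_US
        return i_res + i_up                                         # Eq. (24)
```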

Step 3: train the multi-scale feedback network.

The batch size is set to 16, and rotation and flipping are used for data augmentation. LR images of different sizes and the corresponding HR images are input according to the magnification factor. Adam is used to optimize the network parameters with a momentum factor of 0.9 and a weight decay of 0.0001. The initial learning rate is set to 0.0001 and is halved every 200 iterations.

Different kernel sizes and paddings are designed for each branch of the multi-scale projection unit, and the kernel size and stride are adjusted according to the corresponding magnification factor. Both input and output use the RGB channels of color images. Except for the reconstruction layer at the end of the network, PReLU is used as the activation function after every convolutional and deconvolutional layer. The network is trained on the image data set of Step 1 by the procedure of Step 2 until the loss decreases to the set value and training reaches the maximum number of iterations. The L1 function is used as the loss function:

L(x) = (1/m) Σ_{i=1}^{m} ‖ I_SR^(i) − I_HR^(i) ‖_1  (25)

where x is the set of weight and bias parameters, i is the index of iterative training over the whole training process, and m is the number of training images.
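A training-loop sketch under the stated settings. The `train_loader` (yielding LR/HR batches) and the total epoch count are assumptions, and the schedule below treats the "200 iterations" of the text as 200 scheduler steps.

```python
model = MultiScaleFeedbackNetwork(scale=4).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999),    # momentum factor 0.9
                             weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=200, gamma=0.5)
criterion = nn.L1Loss()                             # Eq. (25)

for epoch in range(1000):                           # assumed maximum iteration count
    for lr_img, hr_img in train_loader:
        sr = model(lr_img.cuda())
        loss = criterion(sr, hr_img.cuda())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                                # halve the learning rate every 200 steps
```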

Step 4: image reconstruction.

The low-resolution image to be processed is input into the trained network to obtain the output high-resolution image.

Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used as evaluation metrics to assess model performance on the five standard test sets Set5, Set14, Urban100, BSD100, and Manga109; all tests use the y channel.
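A sketch of the y-channel PSNR computation used in such evaluations, assuming the common ITU-R BT.601 RGB-to-Y conversion (the patent does not spell out the conversion coefficients):

```python
import numpy as np

def rgb_to_y(img):
    """img: float array in [0, 255] of shape (H, W, 3); returns the BT.601 luma."""
    return 16.0 + (65.481 * img[..., 0] + 128.553 * img[..., 1]
                   + 24.966 * img[..., 2]) / 255.0

def psnr_y(sr, hr):
    mse = np.mean((rgb_to_y(sr) - rgb_to_y(hr)) ** 2)
    return 10.0 * np.log10(255.0 ** 2 / mse)
```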

To verify the effectiveness and reliability of the method, it is compared with multiple existing reconstruction methods at different magnification factors. At low magnifications (×2, ×3, ×4) the method is compared with 21 existing state-of-the-art methods; since many models do not support high magnification (×8), the method is compared there with 12 state-of-the-art methods. For ×2 magnification, the method obtains the best peak signal-to-noise ratio on the five benchmark data sets, and for ×3, ×4, and ×8 magnification both its peak signal-to-noise ratio and its structural similarity surpass all the other models. The advantage grows as the magnification factor increases, especially at ×8, demonstrating the effectiveness of the method at high magnification. On these five data sets the method achieves higher objective evaluation metrics in peak signal-to-noise ratio and structural similarity, which shows that it not only tends to construct regular artificial patterns but is also good at reconstructing irregular natural patterns. The method adapts well to various scene characteristics and yields remarkable super-resolution reconstruction results for images with different characteristics.

The multi-scale feedback network of this embodiment uses only m (m = 800) training images from DIV2K, yet with this relatively small training set it still achieves reconstruction performance at 8× magnification superior to other existing methods. Combining multi-scale convolution with a feedback mechanism lets the method not only learn rich hierarchical feature representations at multiple context scales and capture image features of different scales, but also refine low-level representations with high-level features, better characterizing the relationship between HR and LR images. Besides combining high-level and low-level information, local and global information are combined through global residual learning and local residual feedback fusion to further improve the quality of the reconstructed image. Moreover, the modular end-to-end architecture allows networks of different depths to be trained flexibly and extended arbitrarily to other magnification factors with only small parameter adjustments. The method effectively alleviates the ringing effect and checkerboard artifacts, and it shows excellent reconstruction performance compared with many current state-of-the-art methods, especially at the high magnifications that many methods handle poorly.

The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make several improvements and variations without departing from the technical principles of the present invention, and these improvements and variations shall also be regarded as falling within the protection scope of the present invention, including but not limited to applying this method and its improved and varied forms to other image processing tasks such as image classification, detection, denoising, and enhancement.

Claims (4)

1. The image super-resolution reconstruction method based on the multi-scale feedback network is characterized by comprising the following steps:
step one, an image data set is established by utilizing an image degradation model;
step two, constructing a multi-scale feedback network, wherein the multi-scale feedback network comprises an image feature extraction module, an image feature mapping module and a high-resolution image calculation module;
step 2.1, extracting image features;
inputting the low-resolution image I_LR of the network into the feature extraction module f_0 to generate an initial low-resolution feature map L_0:
L_0 = f_0(I_LR)
let conv(f, n) represent a convolutional layer, where f is the convolution kernel size and n is the number of channels; in the formula above, f_0 consists of 2 convolutional layers conv(3, n_0) and conv(1, n), where n_0 represents the number of channels of the initial low-resolution feature extraction layer and n represents the number of input channels in the feature mapping module; conv(3, n_0) is first used to generate shallow features L_0 with low-resolution image information from the input, and conv(1, n) is then used to reduce the number of channels from n_0 to n;
step 2.2, mapping image features;
a multi-scale up-projection unit and a multi-scale down-projection unit form a projection group to recursively realize the low-resolution and high-resolution feature mappings, so as to obtain high-resolution feature maps of different depths; the low-resolution feature map L_{g-1} is input into the recursive feedback module to generate the high-resolution feature map H_g:
H_g = f_MSPG(L_{g-1}), g = 1, 2, …, G
wherein G represents the number of multi-scale projection groups, i.e., the number of recursions; f_MSPG represents the feature mapping process of the multi-scale projection group in the g-th recursion; when g is equal to 1, the initial feature map L_0 is the input of the first multi-scale projection group; when g is greater than 1, the LR feature map L_{g-1} generated by the previous multi-scale projection group is the current input; the f_MSPG operation includes two operations: mapping LR features to HR features and mapping HR features to LR features;
the mapping of LR features to HR features proceeds as follows:
(1): the LR feature map L_{g-1} calculated in the previous cycle is taken as input, and up-sampling is performed on two branches using deconvolutions with different kernel sizes, obtaining two HR feature maps H_g^1 and H_g^2:
H_g^1 = Deconv1(L_{g-1}),  H_g^2 = Deconv2(L_{g-1})
wherein the two branches are Deconv1(k_1, n) and Deconv2(k_2, n) respectively, k_1 and k_2 indicate the sizes of the deconvolution kernels, and n indicates the number of channels;
(2): the HR feature maps H_g^1 and H_g^2 are concatenated, and down-sampling is performed on two branches using convolutions with different kernel sizes, generating two LR feature maps L_g^1 and L_g^2:
L_g^1 = Conv1([H_g^1, H_g^2]),  L_g^2 = Conv2([H_g^1, H_g^2])
wherein the two branches are Conv1(k_1, 2n) and Conv2(k_2, 2n) respectively, and the number of channels of each branch changes from n to 2n;
(3): the LR feature maps L_g^1 and L_g^2 are concatenated, and pooling and dimensionality-reduction operations are performed by a 1×1 convolution, mapping L_g^1 and L_g^2 to one LR feature map L̂_g:
L̂_g = C_u([L_g^1, L_g^2])
C_u represents Conv(1, n), and the number of channels changes from 2n to n; all 1×1 convolutions add nonlinear excitation to the learned representation of the previous layer;
(4): the residual e_g^l between the input LR feature map L_{g-1} and the reconstructed LR feature map L̂_g is calculated:
e_g^l = L_{g-1} − L̂_g
(5): the residual e_g^l is up-sampled on two branches using deconvolutions with different kernel sizes, and the residual in the LR features is mapped into the HR features, thereby generating new HR residual features E_g^1 and E_g^2:
E_g^1 = Deconv1(e_g^l),  E_g^2 = Deconv2(e_g^l)
wherein the two branches are the deconvolution layers Deconv1(k_1, n) and Deconv2(k_2, n) respectively, and the number of channels of each branch is still n;
(6): the residual HR features E_g^1 and E_g^2 are concatenated, superposed with the HR features concatenated in step (2), and the final HR feature map H_g of the up-projection unit is output through a 1×1 convolution:
H_g = C_h([E_g^1, E_g^2] + [H_g^1, H_g^2])
C_h represents Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, consistent with the number of input channels;
the mapping of HR features to LR features proceeds as follows:
(1): the HR feature map H_g output by the multi-scale up-projection unit of the preceding cycle is taken as input, and down-sampling is performed on two branches using convolutions with different kernel sizes, obtaining two LR feature maps L_g^1 and L_g^2:
L_g^1 = Conv1(H_g),  L_g^2 = Conv2(H_g)
wherein the two branches are Conv1(k_1, n) and Conv2(k_2, n) respectively;
(2): the LR feature maps L_g^1 and L_g^2 are concatenated, and up-sampling is performed on two branches using deconvolutions with different kernel sizes, generating two HR feature maps H_g^1 and H_g^2:
H_g^1 = Deconv1([L_g^1, L_g^2]),  H_g^2 = Deconv2([L_g^1, L_g^2])
wherein the two branches are Deconv1(k_1, 2n) and Deconv2(k_2, 2n) respectively, and the number of channels of each branch changes from n to 2n;
(3): the HR feature maps H_g^1 and H_g^2 are concatenated, and an HR feature map Ĥ_g is obtained through a 1×1 convolution:
Ĥ_g = C_d([H_g^1, H_g^2])
C_d represents Conv(1, n), and the number of channels changes from 2n to n;
(4): the residual e_g^h between the input HR feature map H_g and the reconstructed HR feature map Ĥ_g is calculated:
e_g^h = H_g − Ĥ_g
(5): the residual e_g^h is down-sampled on two branches using convolutions with different kernel sizes, and the residual in the HR features is mapped into the LR features, generating new LR residual features E_g^1 and E_g^2:
E_g^1 = Conv1(e_g^h),  E_g^2 = Conv2(e_g^h)
wherein the two branches are the convolutional layers Conv1(k_1, n) and Conv2(k_2, n) respectively, and the number of channels of each branch is still n;
(6): the residual LR features E_g^1 and E_g^2 are concatenated, superposed with the LR features concatenated in step (2), and the final LR feature map L_g of the down-projection unit is output through a 1×1 convolution:
L_g = C_l([E_g^1, E_g^2] + [L_g^1, L_g^2])
C_l represents Conv(1, n); the total number of channels after the addition is 2n, and Conv(1, n) reduces the number of output channels to n, consistent with the number of input channels;
step 2.3, calculating a high-resolution image;
the high-resolution feature maps are concatenated in depth and the residual image is calculated:
I_Res = f_RM([H_1, H_2, …, H_G])
wherein [H_1, H_2, …, H_G] represents the depth concatenation of the high-resolution feature maps, f_RM represents a conv(3, 3) operation, and I_Res is the residual image;
the image obtained by interpolating the low-resolution image is added to the residual image I_Res to generate the reconstructed high-resolution image I_SR:
I_SR = I_Res + f_US(I_LR)
wherein f_US represents an interpolation operation;
step three, training the multi-scale feedback network;
step four, reconstructing an image;
and inputting the low-resolution image to be processed into the trained network to obtain an output high-resolution image.
2. The image super-resolution reconstruction method based on a multi-scale feedback network according to claim 1, wherein: the process of establishing the data set by using the image degradation model in step one is as follows:
given that I_LR represents the low-resolution image and I_HR represents the corresponding high-resolution image, the degradation process is represented as:
I_LR = D(I_HR; δ)
the degradation mapping that generates the low-resolution image from the high-resolution image is modeled as a single downsampling operation:
I_LR = (I_HR) ↓_s
wherein ↓_s indicates downsampling by the magnification factor s, and δ is a scale factor.
3. The image super-resolution reconstruction method based on a multi-scale feedback network according to claim 1, wherein: the interpolation algorithm is a bilinear interpolation algorithm or a bicubic interpolation algorithm.
4. The image super-resolution reconstruction method based on a multi-scale feedback network according to claim 1, wherein: the loss function for training the multi-scale feedback network in step three is:
L(x) = (1/m) Σ_{i=1}^{m} ‖ I_SR^(i) − I_HR^(i) ‖_1
wherein x is a set of weight parameters and bias parameters, i represents the serial number of iterative training in the whole training process, and m represents the number of training images.
CN202010682515.XA 2020-07-15 2020-07-15 An image super-resolution reconstruction method based on multi-scale feedback network Active CN111861886B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010682515.XA CN111861886B (en) 2020-07-15 2020-07-15 An image super-resolution reconstruction method based on multi-scale feedback network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010682515.XA CN111861886B (en) 2020-07-15 2020-07-15 An image super-resolution reconstruction method based on multi-scale feedback network

Publications (2)

Publication Number Publication Date
CN111861886A CN111861886A (en) 2020-10-30
CN111861886B true CN111861886B (en) 2023-08-08

Family

ID=72983037

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010682515.XA Active CN111861886B (en) 2020-07-15 2020-07-15 An image super-resolution reconstruction method based on multi-scale feedback network

Country Status (1)

Country Link
CN (1) CN111861886B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112418418B (en) * 2020-11-11 2024-08-06 无锡禹空间智能科技有限公司 Data processing method and device based on neural network, storage medium and server
CN112767427A (en) * 2021-01-19 2021-05-07 西安邮电大学 Low-resolution image recognition algorithm for compensating edge information
CN112927159B (en) * 2021-03-11 2022-08-02 清华大学深圳国际研究生院 True image denoising method based on multi-scale selection feedback network
CN113191949B (en) * 2021-04-28 2023-06-20 中南大学 Multi-scale super-resolution pathological image digitization method, system and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9369621B2 (en) * 2010-05-03 2016-06-14 Invisage Technologies, Inc. Devices and methods for high-resolution image and video capture

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108550115A (en) * 2018-04-25 2018-09-18 中国矿业大学 A kind of image super-resolution rebuilding method
CN108734660A (en) * 2018-05-25 2018-11-02 上海通途半导体科技有限公司 A kind of image super-resolution rebuilding method and device based on deep learning
CN108921789A (en) * 2018-06-20 2018-11-30 华北电力大学 Super-resolution image reconstruction method based on recurrence residual error network
CN109035163A (en) * 2018-07-09 2018-12-18 南京信息工程大学 A kind of adaptive denoising method based on deep learning
CN109741260A (en) * 2018-12-29 2019-05-10 天津大学 An efficient super-resolution method based on deep backprojection network
CN110276721A (en) * 2019-04-28 2019-09-24 天津大学 Image super-resolution reconstruction method based on cascaded residual convolutional neural network
CN110197468A (en) * 2019-06-06 2019-09-03 天津工业大学 A kind of single image Super-resolution Reconstruction algorithm based on multiple dimensioned residual error learning network
CN111192200A (en) * 2020-01-02 2020-05-22 南京邮电大学 Image Super-Resolution Reconstruction Method Based on Residual Network with Fusion Attention Mechanism

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Huang Siwei. Research on super-resolution image reconstruction algorithms based on deep learning. China Master's Theses Full-text Database, Information Science and Technology, 2018, (11): I138-528. *

Also Published As

Publication number Publication date
CN111861886A (en) 2020-10-30


Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant