DOI: 10.1007/978-3-031-25063-7_16
Article

Efficient Image Super-Resolution Using Vast-Receptive-Field Attention

Published: 16 February 2023

Abstract

The attention mechanism plays a pivotal role in designing advanced super-resolution (SR) networks. In this work, we design an efficient SR network by improving the attention mechanism. We start from a simple pixel attention module and gradually modify it to achieve better super-resolution performance with fewer parameters. The specific approaches include: (1) increasing the receptive field of the attention branch, (2) replacing large dense convolution kernels with depthwise separable convolutions, and (3) introducing pixel normalization. These approaches paint a clear evolutionary roadmap for the design of attention mechanisms. Based on these observations, we propose VapSR, the Vast-receptive-field Pixel attention network. Experiments demonstrate the superior performance of VapSR: it outperforms present lightweight networks with even fewer parameters, and the light version of VapSR achieves performance comparable to IMDN and RFDN while using only 21.68% and 28.18% of their parameters, respectively. The code and models are available at https://github.com/zhoumumu/VapSR.
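The three modifications named in the abstract can be illustrated with a short PyTorch sketch: an attention branch whose receptive field is enlarged with a dilated depthwise convolution, pointwise/depthwise separation in place of large dense kernels, and a per-pixel normalization across channels. This is a minimal sketch under assumed settings: the class names, the kernel size of 7, the dilation of 3, and the channel width are illustrative choices, not VapSR's actual configuration, which is given in the repository linked above.

# Illustrative PyTorch sketch of the attention ideas described in the abstract.
# Kernel size, dilation, and channel widths are assumptions for demonstration,
# not the paper's exact settings (see the official repository for those).
import torch
import torch.nn as nn


class PixelNorm(nn.Module):
    """Normalize each spatial position across the channel dimension."""
    def __init__(self, channels: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.beta = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mean = x.mean(dim=1, keepdim=True)
        var = x.var(dim=1, keepdim=True, unbiased=False)
        return (x - mean) / torch.sqrt(var + self.eps) * self.gamma + self.beta


class LargeReceptiveFieldPixelAttention(nn.Module):
    """Pixel attention with a dilated depthwise large-kernel branch (illustrative)."""
    def __init__(self, channels: int, kernel_size: int = 7, dilation: int = 3):
        super().__init__()
        padding = dilation * (kernel_size - 1) // 2  # keep spatial size unchanged
        self.attn = nn.Sequential(
            nn.Conv2d(channels, channels, 1),              # pointwise
            nn.Conv2d(channels, channels, kernel_size,     # depthwise, dilated for a
                      padding=padding, dilation=dilation,  # vast receptive field
                      groups=channels),
            nn.Conv2d(channels, channels, 1),              # pointwise
        )
        self.norm = PixelNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = torch.sigmoid(self.attn(x))  # per-pixel, per-channel attention map
        return self.norm(x * attn)


if __name__ == "__main__":
    block = LargeReceptiveFieldPixelAttention(channels=48)
    out = block(torch.randn(1, 48, 64, 64))
    print(out.shape)  # torch.Size([1, 48, 64, 64])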



Information

Published In

Computer Vision – ECCV 2022 Workshops: Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part II
Oct 2022
788 pages
ISBN: 978-3-031-25062-0
DOI: 10.1007/978-3-031-25063-7

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 16 February 2023

Author Tags

  1. Image super-resolution
  2. Deep convolution network
  3. Attention mechanism

Qualifiers

  • Article


Cited By

  • (2024) Dual residual and large receptive field network for lightweight image super-resolution. Neurocomputing 600:C. DOI: 10.1016/j.neucom.2024.128158. Online publication date: 1-Oct-2024.
  • (2024) Multi-scale strip-shaped convolution attention network for lightweight image super-resolution. Image Communication 128:C. DOI: 10.1016/j.image.2024.117166. Online publication date: 1-Oct-2024.
  • (2024) SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution. Computer Vision – ECCV 2024, pp. 359–375. DOI: 10.1007/978-3-031-72973-7_21. Online publication date: 29-Sep-2024.
  • (2023) Crafting training degradation distribution for the accuracy-generalization trade-off in real-world super-resolution. Proceedings of the 40th International Conference on Machine Learning, pp. 41078–41091. DOI: 10.5555/3618408.3620129. Online publication date: 23-Jul-2023.
