Cross-Receptive Focused Inference Network for Lightweight Image Super-Resolution

Published: 02 May 2023

Abstract

Recently, Transformer-based methods have shown impressive performance on single image super-resolution (SISR) tasks owing to their ability to extract global features. However, the capacity of Transformers to dynamically incorporate contextual information during feature extraction has been neglected. To address this issue, we propose a lightweight Cross-receptive Focused Inference Network (CFIN), built as a cascade of CT Blocks that combine a CNN with a Transformer. Specifically, within each CT Block, we first propose a CNN-based Cross-Scale Information Aggregation Module (CIAM) that lets the model focus on potentially helpful information, improving the efficiency of the subsequent Transformer stage. We then design a novel Cross-receptive Field Guided Transformer (CFGT) that selects the contextual information required for reconstruction, using a modulated convolutional kernel that adapts to the current semantic information and exploiting the interaction among different self-attention branches. Extensive experiments show that the proposed CFIN effectively reconstructs images using contextual information and, as an efficient model, strikes a good balance between computational cost and performance.
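Since only the abstract of the paper is preserved here, the following is a rough, hypothetical sketch of the modulated-kernel idea it describes (a static convolution kernel rescaled by a summary of the current input's content), not the authors' actual CFGT formulation. The function name `modulated_conv1d` and the sigmoid-of-mean context summary are illustrative assumptions only:

```python
import numpy as np

def modulated_conv1d(x, base_kernel):
    """Toy 'modulated convolution': a static kernel is rescaled by a
    per-input context scalar (here, a sigmoid of the input mean), so the
    effective filter adapts to the current content. Hypothetical sketch,
    not the CFIN paper's actual formulation."""
    context = 1.0 / (1.0 + np.exp(-x.mean()))  # crude per-input semantic summary
    kernel = base_kernel * context             # modulate the static kernel
    return np.convolve(x, kernel, mode="valid")

x = np.array([1.0, 2.0, 3.0, 4.0])
k = np.array([0.5, 0.5])
y = modulated_conv1d(x, k)  # a brighter input yields a stronger effective filter
```

The design point being illustrated: the kernel applied to the signal is no longer fixed at inference time but is a function of the input itself, which is the common thread in dynamic/modulated convolution approaches.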


Cited By

  • (2024) "A Systematic Survey of Deep Learning-Based Single-Image Super-Resolution," ACM Computing Surveys, vol. 56, no. 10, pp. 1–40, Apr. 2024. doi: 10.1145/3659100
  • (2024) "Efficient Hybrid Feature Interaction Network for Stereo Image Super-Resolution," IEEE Transactions on Multimedia, vol. 26, pp. 10094–10105, 2024. doi: 10.1109/TMM.2024.3405626
  • (2024) "Activating More Information in Arbitrary-Scale Image Super-Resolution," IEEE Transactions on Multimedia, vol. 26, pp. 7946–7961, 2024. doi: 10.1109/TMM.2024.3373257


      Published In

      IEEE Transactions on Multimedia, Volume 26, 2024 (10405 pages)

      Publisher

      IEEE Press

      Qualifiers

      • Research-article
