research-article

Kernel Dimension Matters: To Activate Available Kernels for Real-time Video Super-Resolution

Authors:

Yao Zhao

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

Pages 8617 - 8625

https://doi.org/10.1145/3581783.3611908

Published: 27 October 2023

Abstract

Real-time video super-resolution requires low latency together with high-quality reconstruction. Existing methods mostly rely on pruning schemes or omit complex modules to reduce computational complexity. However, video contains a large amount of temporal redundancy due to inter-frame correlation, which existing methods rarely exploit. Feature maps carry both static and dynamic information, which represent redundant complements and temporal offsets, respectively. Splitting channels by dynamic and static information is therefore crucial for efficient processing. This paper thus proposes a kernel-split strategy that activates available kernels for real-time inference. The strategy focuses on the dimensions of convolutional kernels, namely the channel and depth dimensions. Available kernel dimensions are activated according to the split between high-value and low-value channels. Specifically, a multi-channel selection unit is designed to discriminate the importance of channels and filter the high-value channels hierarchically. At each hierarchy, low-dimensional convolutional kernels are activated to reuse the low-value channels, and re-parameterized convolutional kernels are applied to the high-value channels to merge the depth dimension. In addition, we design a multiple-flow deformable alignment module that provides a sufficient temporal representation at an affordable computational cost. Experimental results demonstrate that our method outperforms other state-of-the-art (SOTA) methods in terms of reconstruction quality and runtime. Code will be available at https://github.com/Kimsure/KSNet.
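The two ideas named in the abstract, splitting channels by importance and merging kernels along the depth dimension, can be illustrated with a small sketch. This is not the KSNet implementation. The function names (`split_channels`, `conv1d`, `merge_kernels`) are hypothetical, the importance score (mean absolute activation) is an assumption standing in for the paper's multi-channel selection unit, and the merge is shown only for two stacked linear 1-D convolutions with no nonlinearity between them, which is the general principle behind re-parameterization.

```python
from typing import List, Sequence, Tuple


def split_channels(features: Sequence[Sequence[float]],
                   keep_ratio: float = 0.5) -> Tuple[List[int], List[int]]:
    """Split channel indices into (high_value, low_value) lists.

    Assumption: importance is the mean absolute activation of a channel.
    The paper's selection unit is not specified here and may differ.
    """
    scores = [sum(abs(v) for v in ch) / len(ch) for ch in features]
    order = sorted(range(len(features)), key=lambda i: scores[i], reverse=True)
    n_high = max(1, int(len(features) * keep_ratio))
    return sorted(order[:n_high]), sorted(order[n_high:])


def conv1d(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """'Valid' 1-D cross-correlation, as used by convolution layers."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


def merge_kernels(k1: Sequence[float], k2: Sequence[float]) -> List[float]:
    """Collapse two stacked linear convolutions (k1 then k2) into one kernel.

    Applying the merged kernel once gives the same output as applying
    k1 and then k2, so the depth of two layers is folded into one.
    """
    merged = [0.0] * (len(k1) + len(k2) - 1)
    for i, a in enumerate(k1):
        for j, b in enumerate(k2):
            merged[i + j] += a * b
    return merged


# Toy feature map: 4 channels with 3 values each.
features = [[0.1, -0.1, 0.0], [2.0, -3.0, 1.0], [0.0, 0.0, 0.0], [1.0, 1.0, -1.0]]
high, low = split_channels(features, keep_ratio=0.5)

# Two stacked kernels versus their merged equivalent.
signal = [1, 2, 3, 4, 5, 6]
k1, k2 = [1, 0, -1], [1, 2]
two_layer = conv1d(conv1d(signal, k1), k2)
one_layer = conv1d(signal, merge_kernels(k1, k2))
```

In this toy setup the merged kernel is applied at inference time in place of the two separate layers, so the output is unchanged while one layer of computation is removed. How KSNet chooses the split ratio and which branches it merges is described in the paper, not here.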

Cited By

Lin J, Tao Z, Tong X, Mai X, Wang H, Wang B, Wang Y, Zhao Q, Yu J, Lin Y, Yan S, Gao S, Zhang W, Cai J, Kankanhalli M, Prabhakaran B, Boll S, Subramanian R, Zheng L, Singh V, Cesar P, Xie L, Xu D. (2024). Suppressing Uncertainties in Degradation Estimation for Blind Super-Resolution. Proceedings of the 32nd ACM International Conference on Multimedia, 6374-6383. Online publication date: 28 October 2024. https://dl.acm.org/doi/10.1145/3664647.3681439

Index Terms

Computing methodologies → Artificial intelligence → Computer vision → Computer vision problems → Reconstruction

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia

October 2023

9913 pages

ISBN: 9798400701085

DOI: 10.1145/3581783

General Chairs:
Abdulmotaleb El Saddik, University of Ottawa, Canada & MBZUAI, UAE
Tao Mei, HiDream.ai, China
Rita Cucchiara, University of Modena and Reggio Emilia, Italy

Program Chairs:
Marco Bertini, University of Florence, Italy
Diana Patricia Tobon Vallejo, Universidad de Medellin, Colombia
Pradeep K. Atrey, University at Albany, State University of New York, USA
M. Shamim Hossain, King Saud University, KSA

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

Sponsor: SIGMM

MM '23: The 31st ACM International Conference on Multimedia

October 29 - November 3, 2023

Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics

Total Citations: 1
Total Downloads: 149
Downloads (Last 12 months): 135
Downloads (Last 6 weeks): 10

Reflects downloads up to 12 Nov 2024
