Multi-task Learning-based All-in-one Collaboration Framework for Degraded Image Super-resolution

Authors:

Kazuyuki Tasaka,

Zhibo Chen

ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Volume 17, Issue 1

Article No.: 21, Pages 1 - 21

https://doi.org/10.1145/3417333

Published: 16 April 2021

Abstract

In this article, we address the degraded image super-resolution problem in a multi-task learning (MTL) manner. To better share representations between multiple tasks, we propose an all-in-one collaboration framework (ACF) with a learnable “junction” unit to handle two major problems in MTL: “How to share” and “How much to share.” Specifically, ACF consists of a sharing phase and a reconstruction phase. Considering the intrinsic characteristics of multiple image degradations, the sharing phase first handles the compression artifacts, motion blur, and spatial structure information of the input image in parallel under a three-branch architecture. Subsequently, the reconstruction phase up-samples these features for high-resolution image reconstruction with a channel-wise and spatial attention mechanism. To coordinate the two phases, we introduce a learnable “junction” unit with a dual-voting mechanism that selectively filters or preserves the shared feature representations coming from the sharing phase, learning an optimal combination for the following reconstruction phase. Finally, a curriculum learning-based training scheme is proposed to improve the convergence of the whole framework. Extensive experimental results on synthetic and real-world low-resolution images show that the proposed all-in-one collaboration framework not only produces favorable high-resolution results while removing serious degradation, but also has high computational efficiency, outperforming state-of-the-art methods. We have also applied ACF to image-quality-sensitive practical tasks, such as pose estimation, to improve estimation accuracy on low-resolution images.
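The abstract does not specify how the “junction” unit or its dual-voting mechanism is computed, so the following is only a minimal sketch of the general idea of selectively filtering or preserving features from three branches. It assumes, purely for illustration, that one vote is a learnable scalar per branch and the other is a learnable scalar per channel, each squashed by a sigmoid and multiplied together. The function names `sigmoid` and `junction`, the gate shapes, and the plain-list feature vectors are all hypothetical and are not taken from the article.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def junction(branch_feats, branch_logits, channel_logits):
    """Fuse per-branch feature vectors with two multiplicative gates.

    branch_feats:   list of B feature vectors, each of length C
    branch_logits:  B scalars, one gate per branch (assumed vote 1)
    channel_logits: B x C scalars, one gate per channel (assumed vote 2)
    Both gate shapes are assumptions made for this sketch only.
    """
    num_channels = len(branch_feats[0])
    fused = [0.0] * num_channels
    for feats, b_logit, c_logits in zip(branch_feats, branch_logits, channel_logits):
        branch_vote = sigmoid(b_logit)
        for c in range(num_channels):
            channel_vote = sigmoid(c_logits[c])
            # A feature survives only to the extent that both votes keep it.
            fused[c] += branch_vote * channel_vote * feats[c]
    return fused


# Three toy branches (e.g. artifact, blur, structure), two channels each.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
neutral = junction(feats, [0.0, 0.0, 0.0], [[0.0, 0.0] for _ in range(3)])
print(neutral)  # [2.25, 3.0]: every gate is 0.5, so each feature is scaled by 0.25
```

In a trained network the logits would be parameters updated by backpropagation, and the gated sum would feed the reconstruction phase; here they are fixed numbers so the effect of each vote can be checked by hand.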

Index Terms

Computing methodologies > Artificial intelligence > Computer vision > Computer vision problems > Reconstruction

Information & Contributors

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 17, Issue 1

February 2021

392 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3453992

Editor:
Alberto Del Bimbo
University of Firenze, Italy

Copyright © 2021 Association for Computing Machinery.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 April 2021

Accepted: 01 August 2020

Revised: 01 August 2020

Received: 01 January 2020

Published in TOMM Volume 17, Issue 1

Qualifiers

Research-article
Refereed

Funding Sources

NSFC

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 5
Total Downloads: 301
Downloads (Last 12 months): 37
Downloads (Last 6 weeks): 5

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Li H, Zhang Z, Jiang T, Luo P, Feng H, Xu Z, Williams B, Chen Y, Neville J. (2023). Real-world deep local motion deblurring. Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, 1314-1322. DOI: 10.1609/aaai.v37i1.25215. Online publication date: 7-Feb-2023.
https://dl.acm.org/doi/10.1609/aaai.v37i1.25215

Guo K, Chen L, Zhu X, Kui X, Zhang J, Shi H. (2023). Double-Layer Search and Adaptive Pooling Fusion for Reference-Based Image Super-Resolution. ACM Transactions on Multimedia Computing, Communications, and Applications 20:1, 1-23. DOI: 10.1145/3604937. Online publication date: 21-Jun-2023.
https://dl.acm.org/doi/10.1145/3604937

Hsu W, Jian P. (2023). Recurrent Multi-scale Approximation-Guided Network for Single Image Super-Resolution. ACM Transactions on Multimedia Computing, Communications, and Applications 19:6, 1-21. DOI: 10.1145/3592613. Online publication date: 14-Apr-2023.
https://dl.acm.org/doi/10.1145/3592613

Yang H, Wang Z, Liu X, Li C, Xin J, Wang Z. (2023). Deep learning in medical image super resolution: a review. Applied Intelligence 53:18, 20891-20916. DOI: 10.1007/s10489-023-04566-9. Online publication date: 1-Sep-2023.
https://dl.acm.org/doi/10.1007/s10489-023-04566-9

Wang W, Lin L, Fan Z, Liu J. (2022). Semi-supervised Learning for Mars Imagery Classification and Segmentation. ACM Transactions on Multimedia Computing, Communications, and Applications 19:4, 1-23. DOI: 10.1145/3572916. Online publication date: 1-Dec-2022.
https://dl.acm.org/doi/10.1145/3572916
