DOI: 10.1145/3394171.3413613
Research Article

Dual-view Attention Networks for Single Image Super-Resolution

Published: 12 October 2020

Abstract

One non-negligible flaw of convolutional neural network (CNN) based single image super-resolution (SISR) models is that most of them cannot restore high-resolution (HR) images containing sufficient high-frequency information. Worse still, as the depth of CNNs increases, training easily suffers from vanishing gradients. These problems limit the effectiveness of CNNs in SISR. In this paper, we propose Dual-view Attention Networks to alleviate these problems for SISR. Specifically, we propose local aware (LA) and global aware (GA) attentions that treat low-resolution (LR) features unequally: they highlight the high-frequency components and discriminate each feature of the LR images in the local and global views, respectively. Furthermore, we propose the local attentive residual-dense (LARD) block, which combines the LA attention with multiple residual and dense connections, to build a deeper yet easy-to-train architecture. Experimental results verify the effectiveness of our model compared with other state-of-the-art methods.
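
The abstract does not give layer-level details, so the following PyTorch sketch only illustrates, under stated assumptions, how the three ideas it names could be wired together: a local attention that gates features from a small spatial neighbourhood, a global attention that gates whole channels from global statistics, and a residual-dense block whose output is modulated by the local attention. The module names (LocalAwareAttention, GlobalAwareAttention, LARDBlock), kernel sizes, and growth rates are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch (not the authors' code): minimal local/global attention and an
# attention-gated residual-dense block, as loosely described in the abstract.
import torch
import torch.nn as nn


class LocalAwareAttention(nn.Module):
    """Per-pixel gate computed from a local neighbourhood (assumed 3x3)."""
    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        return x * torch.sigmoid(self.conv(x))   # emphasize local high-frequency detail


class GlobalAwareAttention(nn.Module):
    """Per-channel gate from global average statistics (squeeze-and-excitation style)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)                  # discriminate features in the global view


class LARDBlock(nn.Module):
    """Residual-dense block whose fused output is gated by the local attention."""
    def __init__(self, channels, growth=32, layers=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(channels + i * growth, growth, kernel_size=3, padding=1)
            for i in range(layers)
        )
        self.fuse = nn.Conv2d(channels + layers * growth, channels, kernel_size=1)
        self.attn = LocalAwareAttention(channels)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:                  # dense connections
            feats.append(torch.relu(conv(torch.cat(feats, dim=1))))
        out = self.attn(self.fuse(torch.cat(feats, dim=1)))
        return x + out                           # residual connection


if __name__ == "__main__":
    lr_feat = torch.randn(1, 64, 48, 48)         # toy LR feature map
    print(LARDBlock(64)(lr_feat).shape)          # torch.Size([1, 64, 48, 48])
    print(GlobalAwareAttention(64)(lr_feat).shape)
```

As a usage note, stacking several such attention-gated blocks with skip connections is a common way to obtain the "deeper yet easy-to-train" behaviour the abstract refers to; the exact arrangement used in the paper is not reproduced here.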

Supplementary Material

MP4 File (3394171.3413613.mp4)

Information

Published In

MM '20: Proceedings of the 28th ACM International Conference on Multimedia
October 2020
4889 pages
ISBN:9781450379885
DOI:10.1145/3394171
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. convolutional neural networks
  2. dual-view aware attention
  3. highlight
  4. super-resolution

Qualifiers

  • Research-article

Conference

MM '20

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 25
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 19 Nov 2024

Cited By
  • (2024) RTNN: A Neural Network-Based In-Loop Filter in VVC Using Resblock and Transformer. IEEE Access, 12, 104599-104610. DOI: 10.1109/ACCESS.2024.3431527. Online publication date: 2024.
  • (2024) A joint image super-resolution network for multiple degradations removal via complementary transformer and convolutional neural network. IET Image Processing, 18(5), 1344-1357. DOI: 10.1049/ipr2.13030. Online publication date: 15-Jan-2024.
  • (2024) Next-gen image enhancement: CapsNet-driven auto-encoder model in single image super resolution. Multimedia Tools and Applications. DOI: 10.1007/s11042-024-18798-5. Online publication date: 11-Apr-2024.
  • (2023) Unfolding Once is Enough: A Deployment-Friendly Transformer Unit for Super-Resolution. Proceedings of the 31st ACM International Conference on Multimedia, 7952-7960. DOI: 10.1145/3581783.3612128. Online publication date: 26-Oct-2023.
  • (2023) Towards Fairer and More Efficient Federated Learning via Multidimensional Personalized Edge Models. 2023 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN54540.2023.10191956. Online publication date: 18-Jun-2023.
  • (2022) Quality Assessment of Image Super-Resolution: Balancing Deterministic and Statistical Fidelity. Proceedings of the 30th ACM International Conference on Multimedia, 934-942. DOI: 10.1145/3503161.3547899. Online publication date: 10-Oct-2022.
  • (2022) Cross Parallax Attention Network for Stereo Image Super-Resolution. IEEE Transactions on Multimedia, 24, 202-216. DOI: 10.1109/TMM.2021.3050092. Online publication date: 2022.
  • (2021) Pyramid Information Distillation Attention Network for Super-Resolution Reconstruction of Remote Sensing Images. Remote Sensing, 13(24), 5143. DOI: 10.3390/rs13245143. Online publication date: 17-Dec-2021.
  • (2021) Parallax-based second-order mixed attention for stereo image super-resolution. IET Computer Vision, 16(1), 26-37. DOI: 10.1049/cvi2.12063. Online publication date: 19-Jun-2021.
  • (2021) Progressive Multi-scale Reconstruction for Guided Depth Map Super-Resolution via Deep Residual Gate Fusion Network. Advances in Computer Graphics, 67-79. DOI: 10.1007/978-3-030-89029-2_5. Online publication date: 11-Oct-2021.
  • Show More Cited By
