Abstract
Most deep learning-based inpainting approaches cannot effectively perceive and represent image information at multiple scales. Moreover, they typically adopt spatial attention to exploit information from the image background while ignoring the effect of channel attention, and therefore tend to produce blurred, low-quality restorations. In this paper, we propose Res2U-Net, a novel backbone architecture that addresses these problems. Both the encoder and decoder layers of Res2U-Net employ multi-scale residual structures, which respectively extract and represent multi-scale image features. In addition, we incorporate channel attention into the network by introducing a dilated multi-scale channel-attention block embedded in the skip-connection layers of Res2U-Net; this block exploits the low-level features produced by the encoder layers. Experiments on the CelebA-HQ and Paris StreetView datasets demonstrate that Res2U-Net outperforms state-of-the-art inpainting approaches both qualitatively and quantitatively.
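The abstract names two building blocks but does not give implementation details. As a rough illustration only, the PyTorch sketch below shows one plausible reading of a Res2Net-style multi-scale residual unit and a dilated multi-scale channel-attention block for the skip connections. All class names, channel splits, dilation rates, and the squeeze-and-excitation form of the attention are our assumptions for illustration, not the authors' released implementation.

import torch
import torch.nn as nn

class Res2Block(nn.Module):
    """Hypothetical Res2Net-style unit: split the channels into groups and
    pass them through a hierarchy of 3x3 convolutions so that different
    groups see different effective receptive fields (multiple scales)."""
    def __init__(self, channels, scales=4):
        super().__init__()
        assert channels % scales == 0
        self.scales = scales
        width = channels // scales
        self.convs = nn.ModuleList(
            nn.Conv2d(width, width, 3, padding=1) for _ in range(scales - 1)
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        splits = torch.chunk(x, self.scales, dim=1)
        out = [splits[0]]  # first group passes through unchanged
        for i, conv in enumerate(self.convs):
            # each later group also receives the previous group's output
            y = splits[i + 1] if i == 0 else splits[i + 1] + out[-1]
            out.append(self.relu(conv(y)))
        return torch.cat(out, dim=1) + x  # residual connection

class DilatedChannelAttention(nn.Module):
    """Hypothetical skip-connection block: dilated convolutions enlarge the
    receptive field at several rates, then squeeze-and-excitation-style
    channel attention reweights the fused feature maps."""
    def __init__(self, channels, rates=(1, 2, 4), reduction=16):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=r, dilation=r)
            for r in rates
        )
        self.fuse = nn.Conv2d(channels * len(rates), channels, 1)
        self.attn = nn.Sequential(  # SE-style channel attention
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        multi = torch.cat([b(x) for b in self.branches], dim=1)
        fused = self.fuse(multi)
        return fused * self.attn(fused)  # channel-wise reweighting

Under this reading, a skip connection in the U-Net would pass the encoder feature map through DilatedChannelAttention before it is concatenated with the corresponding decoder feature map.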
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Grant No. 61263048), by the Applied Basic Research Project of Yunnan Province (Grant No. 2018FB102), and by the Young and Middle-Aged Backbone Teachers’ Cultivation Plan of Yunnan University (Grant No. XT412003).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, H., Yu, Y. (2020). Res2U-Net: Image Inpainting via Multi-scale Backbone and Channel Attention. In: Yang, H., Pasupa, K., Leung, A.C.S., Kwok, J.T., Chan, J.H., King, I. (eds.) Neural Information Processing. ICONIP 2020. Lecture Notes in Computer Science, vol. 12532. Springer, Cham. https://doi.org/10.1007/978-3-030-63830-6_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63829-0
Online ISBN: 978-3-030-63830-6
eBook Packages: Computer Science; Computer Science (R0)