Abstract
In recent years, we have witnessed the great success of deep learning on various problems both in low and high-level computer visions. The low-level vision problems, including inpainting, deblurring, denoising, super-resolution, and so on, are highly anticipated to occur in machine vision and image processing. Many deep learning based methods have been proposed to solve low-level vision problems. Most researches treat these problems independently; however, most of the time they appear concurrently. Motivated by the success of generative model in the field of image generation, we develop a deep cascade of neural networks to solve the inpainting, deblurring, denoising problems at the same time. Our model contains two networks: inpainting GAN and deblurring-denoising network. Inpainting GAN generates the coarse patches to fill the lost part in damaged image, and the deblurring-denoising network, stacked by a convolutional auto-encoder, will further refine them. Unlike other methods that handle each problem separately, our method jointly optimizes the two sub-networks. Because GAN training is not only unstable but also difficult, we adopt the Wasserstein distance as the loss function of the inpainting GAN and propose a gradual training strategy. Learning from the idea of residual learning, we utilize skip connections to pass image details from input to reconstruction layer. Experimental results have demonstrated that the proposed model can achieve state-of-the-art performance. Through the experiments, we also demonstrated the effectiveness of the cascade architecture.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Arjovsky M, Chintala S, Bottou L (2017) Wasserstein gan. arXiv preprint arXiv:170107875
Barnes C, Shechtman E, Finkelstein A, Goldman DB (2009) PatchMatch: a randomized correspondence algorithm for structural image editing. ACM Trans Graph 28 (3):24:21–24:11
Beck A, Teboulle M (2009) Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE Trans Image Process 18(11):2419–2434
Bengio Y, Yao L, Alain G, Vincent P (2013) Generalized denoising auto-encoders as generative models. Adv Neural Inf Proces Syst:899–907
Bertalmio M, Sapiro G, Caselles V, Ballester C (2000) Image inpainting. In: Proceedings of the 27th annual conference on computer graphics and interactive techniques. ACM Press/Addison-Wesley Publishing Co., pp 417–424
Bertalmio M, Vese L, Sapiro G, Osher S (2003) Simultaneous structure and texture image inpainting. IEEE Trans Image Process 12(8):882–889
Burger HC, Schuler CJ, Harmeling S (2012) Image denoising: can plain neural networks compete with BM3D? In: Computer Vision and Pattern Recognition (CVPR), 2012 I.E. Conference on, IEEE, pp 2392–2399
Cai J-F, Ji H, Liu C, Shen Z (2009) Blind motion deblurring from a single image using sparse approximation. In: Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, IEEE, pp 104–111
Cho S, Lee S (2009) Fast motion deblurring. ACM Trans Graph 28(5):1–8. https://doi.org/10.1145/1618452.1618491
Criminisi A, Perez P, Toyama K (2003) Object removal by exemplar-based inpainting. In: Computer vision and pattern recognition, 2003. Proceedings. 2003 I.E. Computer Society Conference on, IEEE, pp II-II
Criminisi A, Pérez P, Toyama K (2004) Region filling and object removal by exemplar-based image inpainting. IEEE Trans Image Process 13(9):1200–1212
Dabov K, Foi A, Katkovnik V, Egiazarian K (2007) Image denoising by sparse 3-D transform-domain collaborative filtering. IEEE Trans Image Process 16(8):2080–2095
Dong W, Zhang L, Shi G, Wu X (2011) Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Trans Image Process 20(7):1838–1857
Elad M, Aharon M (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736–3745
Gharbi M, Chaurasia G, Paris S, Durand F (2016) Deep joint demosaicking and denoising. ACM Trans Graph (TOG) 35(6):191
Girshick R, Donahue J, Darrell T, Malik J (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. International Conference on Neural Information Processing Systems, In, pp 2672–2680
Hays J, Efros AA (2007) Scene completion using millions of photographs. In: ACM SIGGRAPH, p 4
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
Isola P, Zhu J-Y, Zhou T, Efros AA (2016) Image-to-image translation with conditional adversarial networks. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1125–1134
Ji H, Wang K (2012) Robust image deblurring with an inaccurate blur kernel. IEEE Trans Image Process 21(4):1624–1634
Johnson J, Alahi A, Fei-Fei L (2016) Perceptual losses for real-time style transfer and super-resolution. In: European conference on computer vision (ECCV). pp 694–711
Krizhevsky A, Sutskever I, Hinton GE (2012) Imagenet classification with deep convolutional neural networks. In: International Conference on Neural Information Processing Systems, pp 1097–1105
Le Meur O, Ebdelli M, Guillemot C (2013) Hierarchical super-resolution-based inpainting. IEEE Trans Image Process 22(10):3779–3790
Liu D, Sun X, Wu F, Li S, Zhang Y-Q (2007) Image compression with edge-based inpainting. IEEE Trans Circuits Syst Video Technol 17(10):1273–1287
Liu J, Shang S, Zheng K, Wen J-R (2016) Multi-view ensemble learning for dementia diagnosis from neuroimaging: an artificial neural network approach. Neurocomputing 195:112–116
Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3431–3440
Martin D, Fowlkes C, Tal D, Malik J (2001) A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: Computer Vision, 2001 ICCV 2001 Proceedings Eighth IEEE International Conference on, IEEE, pp 416–423
Pathak D, Krahenbuhl P, Donahue J, Darrell T, Efros AA (2016) Context encoders: feature learning by inpainting. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 2536–2544
Radford A, Metz L, Chintala S (2016) Unsupervised representation learning with deep convolutional generative adversarial networks. In: International Conference on Learning Representation (ICLR)
Shan Q, Jia J, Agarwala A (2008) High-quality motion deblurring from a single image. ACM Trans Graph 27(3):1–10
Shang S, Liu J, Zhao K, Yang M, Zheng K, Wen J-r (2015) Dimension reduction with meta object-groups for efficient image retrieval. Neurocomputing 169:50–54
Shang S, Guo D, Liu J, Zheng K, Wen J-R (2016) Finding regions of interest using location based social media. Neurocomputing 173:118–123
Shang S, Guo D, Liu J, Wen J-R (2016) Prediction-based unobstructed route planning. Neurocomputing 213:147–154
Shao L, Yan R, Li X, Liu Y (2014) From heuristic optimization to dictionary learning: a review and comprehensive comparison of image denoising algorithms. IEEE Trans Cybern 44(7):1001–1013
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representation (ICLR), pp 1–14
Sun J, Cao W, Xu Z, Ponce J (2015) Learning a convolutional neural network for non-uniform motion blur removal. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 769–777
Vincent P, Larochelle H, Lajoie I, Bengio Y, Manzagol P-A (2010) Stacked denoising autoencoders: learning useful representations in a deep network with a local denoising criterion. J Mach Learn Res 11:3371–3408
Vondrick C, Pirsiavash H, Torralba A (2016) Generating videos with scene dynamics. In: International Conference on Neural Information Processing Systems, pp 613–621
Wang Z, Bovik AC, Sheikh HR, Simoncelli EP (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process 13(4):600–612
Xu L, Ren JS, Liu C, Jia J (2014) Deep convolutional neural network for image deconvolution. In: International Conference on Neural Information Processing Systems, pp 1790–1798
Yang C, Lu X, Lin Z, Shechtman E, Wang O, Li H (2017) High-resolution image inpainting using multiscale neural patch synthesis. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp 6721–6729
Zhang K, Zuo W, Chen Y, Meng D, Zhang L (2017) Beyond a Gaussian Denoiser: residual learning of deep CNN for image denoising. IEEE Trans Image Process 26(7):3142–3155
Zhu S, Wang Y, Shang S, Zhao G, Wang J (2017) Probabilistic routing using multimodal data. Neurocomputing 253:49–55
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Zhao, G., Liu, J., Jiang, J. et al. A deep cascade of neural networks for image inpainting, deblurring and denoising. Multimed Tools Appl 77, 29589–29604 (2018). https://doi.org/10.1007/s11042-017-5320-7
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-017-5320-7