Abstract
Photorealistic style transfer aims to render the style of a reference image onto a content image under the constraint that the stylized result remains realistic. While existing methods have achieved promising results, they are prone to producing either structural distortions or inconsistent styles because they lack an effective style representation. In this work, to represent the inherent style information effectively, we propose a two-branch learnable transfer mechanism that exploits the complementary advantages of first-order and second-order image statistics simultaneously. Instead of using these statistics directly, we design a learnable transfer branch that learns the second-order image statistics, capturing a consistent style while improving efficiency. We further use a multi-scale representation branch to retain more structural details of the content image. In addition, a lightweight but effective adaptive-aggregation mechanism is proposed to dynamically fuse the features across the branches and balance style consistency against photorealism. Qualitative and quantitative experiments demonstrate that the proposed method renders images faithfully, producing photorealistic results with high efficiency.
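The architecture itself is detailed in the full paper; purely as an illustration of the terms used above, the PyTorch sketch below contrasts a first-order (mean/standard-deviation) statistics transfer with a second-order (channel-covariance) transfer driven by a learnable predictor, and shows a lightweight gated fusion of the two branch outputs. The module names, tensor layouts, and the covariance-based transform predictor are assumptions made for exposition, not the authors' implementation.

```python
import torch
import torch.nn as nn


def first_order_transfer(fc, fs, eps=1e-5):
    """First-order statistics transfer: match per-channel mean and std (AdaIN-style)."""
    # fc, fs: (B, C, H, W) feature maps of the content and style images
    mu_c = fc.mean(dim=(2, 3), keepdim=True)
    std_c = fc.std(dim=(2, 3), keepdim=True) + eps
    mu_s = fs.mean(dim=(2, 3), keepdim=True)
    std_s = fs.std(dim=(2, 3), keepdim=True) + eps
    return (fc - mu_c) / std_c * std_s + mu_s


class SecondOrderBranch(nn.Module):
    """Learnable second-order transfer (hypothetical layout): predict a CxC
    transformation from the channel covariances of content and style features."""

    def __init__(self, channels):
        super().__init__()
        self.predictor = nn.Sequential(
            nn.Linear(2 * channels * channels, channels * channels),
            nn.ReLU(inplace=True),
            nn.Linear(channels * channels, channels * channels),
        )

    @staticmethod
    def covariance(f):
        # f: (B, C, H, W) -> per-image channel covariance (B, C, C)
        b, c, h, w = f.shape
        x = f.flatten(2)                         # (B, C, HW)
        x = x - x.mean(dim=2, keepdim=True)      # center each channel
        return x @ x.transpose(1, 2) / (h * w - 1)

    def forward(self, fc, fs):
        b, c, h, w = fc.shape
        cov = torch.cat([self.covariance(fc).flatten(1),
                         self.covariance(fs).flatten(1)], dim=1)
        t = self.predictor(cov).view(b, c, c)    # learned CxC transform
        out = t @ fc.flatten(2)                  # apply transform to content features
        return out.view(b, c, h, w)


class AdaptiveAggregation(nn.Module):
    """Lightweight gate that mixes two branch outputs with per-channel weights."""

    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(2 * channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, f1, f2):
        a = self.gate(torch.cat([f1, f2], dim=1))   # (B, C, 1, 1) mixing weights
        return a * f1 + (1.0 - a) * f2


if __name__ == "__main__":
    fc, fs = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
    second = SecondOrderBranch(64)
    fuse = AdaptiveAggregation(64)
    out = fuse(first_order_transfer(fc, fs), second(fc, fs))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```

In this sketch the first-order branch only aligns per-channel means and variances, while the second-order branch can also reshape cross-channel correlations, which is why the two are complementary; the gate then trades off the two outputs channel by channel.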
About this article
Cite this article
Huo, Z., Li, X., Qiao, Y. et al. Efficient photorealistic style transfer with multi-order image statistics. Appl Intell 52, 12533–12545 (2022). https://doi.org/10.1007/s10489-021-03154-z