Abstract
For arbitrary image style transfer, several architectures have been proposed that directly compute the transformation matrix of the whitening and coloring transform (WCT) to obtain more satisfactory stylization results. However, calculating the WCT transformation matrix is time-consuming. Li et al. trained a linear transformation module (LST) to generate a WCT transformation matrix for any pair of content and style images, avoiding the complex calculation and improving time efficiency. In this work, we introduce a flexible arbitrary image style transfer framework based on the LST, which uses deep neural networks to train a linear transformation matrix approximating the standard WCT matrix. First, an inverse relationship between the whitening matrix and the coloring matrix of the same image is enforced during training of the linear transformation matrix, so that the resulting matrix is more accurate and closer to the standard WCT matrix. Second, a split-and-transform scheme is proposed. Unlike the LST, which transforms the block of feature maps as a whole, the split-and-transform scheme divides the feature block into several smaller blocks and transforms them individually, making the transformation more localized; the more blocks, the stronger the locality. Moreover, the proposed scheme allows users to choose the number of blocks to flexibly control the locality of the transformation. Experimental results demonstrate the effectiveness and flexibility of the proposed framework through high-quality stylized images and an adjustable balance between the globality and locality of the transformations. The split-and-transform scheme also reduces computation time while preserving or even improving the stylization results.
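The classic WCT and the split idea described above can be illustrated with a short NumPy sketch. This is not the paper's learned linear transformation module; it computes the whitening and coloring matrices directly by eigendecomposition, and it assumes a channel-wise interpretation of the split-and-transform scheme. The function names and the splitting axis are illustrative assumptions.

```python
import numpy as np

def wct(cf, sf, eps=1e-5):
    """Whitening-and-coloring transform on flattened feature maps of shape (C, N).

    Illustrative direct computation (eigendecomposition), not a learned module.
    """
    mc, ms = cf.mean(1, keepdims=True), sf.mean(1, keepdims=True)
    cf, sf = cf - mc, sf - ms
    # Whitening matrix from the content covariance: decorrelates content features
    wc, vc = np.linalg.eigh(cf @ cf.T / cf.shape[1] + eps * np.eye(cf.shape[0]))
    whiten = vc @ np.diag(wc ** -0.5) @ vc.T
    # Coloring matrix from the style covariance: re-correlates with style statistics
    ws, vs = np.linalg.eigh(sf @ sf.T / sf.shape[1] + eps * np.eye(sf.shape[0]))
    color = vs @ np.diag(ws ** 0.5) @ vs.T
    # Whiten the content, color with the style, then restore the style mean
    return color @ (whiten @ cf) + ms

def split_and_transform(cf, sf, k):
    """Split the C channels into k groups and apply WCT to each group separately.

    Assumed channel-wise reading of the split-and-transform scheme: smaller
    groups give smaller per-group transforms, i.e. a more localized effect.
    """
    groups = [wct(c, s) for c, s in zip(np.array_split(cf, k, axis=0),
                                        np.array_split(sf, k, axis=0))]
    return np.concatenate(groups, axis=0)
```

With `k = 1` this reduces to a single global WCT; increasing `k` makes the transform block-diagonal over channel groups, which is also cheaper, since each eigendecomposition operates on a smaller covariance matrix.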
Code availability
“Not applicable”.
References
Wang TC, Liu MY, Zhu JY, Tao A, Kautz J, Catanzaro B (Aug., 2018) High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. arXiv:1711.11585 [cs.CV]
Miyato T, Koyama M (Aug., 2018) CGANs with Projection Discriminator. arXiv:1802.05637 [cs.LG]
Zhu JY, Zhang R, Pathak D, Darrell T, Efros AA, Wang O, Shechtman E (Oct., 2018) Toward Multimodal Image-to-Image Translation. arXiv:1711.11586 [cs.CV]
Zhu JY, Park T, Isola P, Efros AA (Nov., 2018) Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. arXiv:1703.10593 [cs.CV]
Isola P, Zhu JY, Zhou T, Efros AA (Nov., 2018) Image-to-Image Translation with Conditional Adversarial Networks. arXiv:1611.07004 [cs.CV]
Park T, Liu MY, Wang TC, Zhu JY (Nov., 2019) Semantic Image Synthesis with Spatially-Adaptive Normalization. arXiv:1903.07291 [cs.CV]
Kotovenko D, Sanakoyeu A, Ma P, Lang S, Ommer B (Mar., 2020) A Content Transformation Block for Image Style Transfer. arXiv:2003.08407 [cs.CV]
Gatys LA, Ecker AS, Bethge M (Sept., 2015) A Neural Algorithm of Artistic Style. arXiv:1508.06576 [cs.CV]
Johnson J, Alahi A, Li FF (Mar., 2016) Perceptual Losses for Real-Time Style Transfer and Super-Resolution. arXiv:1603.08155 [cs.CV]
Li Y, Fang C, Yang J, Wang Z, Lu X, Yang MH (Nov., 2017) Universal Style Transfer via Feature Transforms. arXiv:1705.08086 [cs.CV]
Xu Z, Wilber M, Fang C, Hertzmann A, Jin H (May, 2018) Learning from Multi-Domain Artistic Images for Arbitrary Style Transfer. arXiv:1805.09987 [cs.CV]
Sheng L, Lin Z, Shao J, Wang X (May, 2018) Avatar-Net Multi-Scale Zero-Shot Style Transfer by Feature Decoration. arXiv:1805.03857 [cs.CV]
Ulyanov D, Lebedev V, Vedaldi A, Lempitsky VS (2016) Texture networks: Feed-forward synthesis of textures and stylized images. In: Proceedings of the 33rd International Conference on Machine Learning, June 19–24, pp. 1349–1357
Huang X, Belongie S (2017) Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization. arXiv:1703.06868v2 [cs.CV]
Ulyanov D, Vedaldi A, Lempitsky V (2016) Instance Normalization: The Missing Ingredient for Fast Stylization. arXiv:1607.08022v3 [cs.CV]
Ulyanov D, Vedaldi A, Lempitsky V (2017) Improved texture networks: Maximizing quality and diversity in feed-forward stylization and texture synthesis. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 6924–6932
Li X, Liu S, Kautz J, Yang MH (Aug., 2018) Learning Linear Transformations for Fast Arbitrary Style Transfer. arXiv:1808.04537 [cs.CV]
Park DY, Lee KH (May, 2019) Arbitrary Style Transfer with Style-Attentional Networks. arXiv:1812.02342 [cs.CV]
An J, Xiong H, Luo J, Huan J, Ma J (Jul., 2019) Fast Universal Style Transfer for Artistic and Photorealistic Rendering. arXiv:1907.03118v1 [cs.CV]
Nguyen AD, Choi S, Kim W, Lee S (May, 2019) A Simple Way of Multimodal and Arbitrary Style Transfer. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Wang Z, Zhao L, Chen H, Qiu L, Mo Q, Lin S, Xing W, Lu D (Nov., 2019) Diversified Arbitrary Style Transfer via Deep Feature Perturbation. arXiv:1909.08223 [cs.CV]
Zhang Y, Fang C, Wang Y, Wang Z, Lin Z, Fu Y, Yang J (Jan., 2020) Multimodal Style Transfer via Graph Cuts. arXiv:1904.04443 [cs.CV]
An J, Xiong H, Huan J, Luo J (2020) Ultrafast Photorealistic Style Transfer via Neural Architecture Search. arXiv:1912.02398v2 [cs.CV]
Dumoulin V, Shlens J, Kudlur M (2017) A learned representation for artistic style. arXiv:1610.07629v5 [cs.CV]
Chen D, Yuan L, Liao J, Yu N, Hua G (Mar., 2017) Stylebank: An explicit representation for neural image style transfer. CVPR
Zhang H, Dana K (2017) Multi-style generative network for real-time transfer. arXiv:1703.06953v2 [cs.CV]. https://doi.org/10.48550/arXiv.1703.06953
Jin Y, Yang Y, Feng Z, Ye J, Yu Y, Song M (Oct., 2018) Neural Style Transfer: A Review. arXiv:1705.04058 [cs.CV]
Simonyan K, Zisserman A (April, 2015) Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556v6 [cs.CV]
Deng J, Dong W, Socher R, Li L-J, Li K, Li F-F (2009) Imagenet: A Large-Scale Hierarchical Image Database. In: CVPR
Lin TY, Maire M, Belongie S, Bourdev L, Girshick R, Hays J, Perona P, Ramanan D, Zitnick CL, Dollár P (2014) Microsoft coco: Common objects in context. In: ECCV
Nichol K (2016) Painter by numbers, wikiart. https://www.kaggle.com/c/painter-by-numbers
Shaha B, Bhavsarb H (2022) Time Complexity in Deep Learning Models. Procedia Computer Science 215:202–210
Acknowledgements
This work was supported by the Ministry of Science and Technology, Taiwan, R.O.C. under the grant MOST-109-2221-E-032-029.
Funding
Author Hwei Jen Lin received a grant from the Ministry of Science and Technology, Taiwan; the other authors declare that they have no financial interests.
Author information
Authors and Affiliations
Contributions
C.-T. Tu: Developed methodology and co-wrote the paper. H. J. Lin: Developed methodology and co-wrote the paper. Y. Tsai: Developed methodology and algorithms. Z.-J. Lin: Collected data, developed computer programs and performed experiments.
Corresponding author
Ethics declarations
Ethics approval
“Not applicable”.
Consent to participate
“Not applicable”.
Consent for publication
“Not applicable”.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Tu, CT., Lin, H.J., Tsai, Y. et al. Arbitrary style transfer system with split-and-transform scheme. Multimed Tools Appl 83, 62497–62517 (2024). https://doi.org/10.1007/s11042-023-16582-5