DOI: 10.1007/978-3-031-19790-1_17

Article

Improving the Perceptual Quality of 2D Animation Interpolation

Published: 23 October 2022

Abstract

Traditional 2D animation is labor-intensive, often requiring animators to manually draw twelve illustrations per second of movement. While automatic frame interpolation may ease this burden, 2D animation poses additional difficulties compared to photorealistic video. In this work, we address challenges unexplored in previous animation interpolation systems, with a focus on improving perceptual quality. Firstly, we propose SoftsplatLite (SSL), a forward-warping interpolation architecture with fewer trainable parameters and better perceptual performance. Secondly, we design a Distance Transform Module (DTM) that leverages line proximity cues to correct aberrations in difficult solid-color regions. Thirdly, we define a Restricted Relative Linear Discrepancy metric (RRLD) to automate the previously manual training data collection process. Lastly, we explore evaluation of 2D animation generation through a user study, and establish that the LPIPS perceptual metric and chamfer line distance (CD) are more appropriate measures of quality than the PSNR and SSIM metrics used in prior art.
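
To ground the evaluation metrics named above, the sketch below computes a symmetric chamfer line distance (CD) between two binary line drawings using Euclidean distance transforms, the same primitive that line-proximity cues such as the DTM's build on. This is a minimal illustration, assuming SciPy's `distance_transform_edt` and boolean line masks as inputs; the paper's exact line extraction and normalization may differ.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def chamfer_line_distance(lines_a: np.ndarray, lines_b: np.ndarray) -> float:
    """Symmetric chamfer distance between two binary line maps.

    lines_a, lines_b: boolean arrays of equal shape, True on line pixels.
    Both drawings are assumed non-empty. Lower values mean the two sets
    of lines sit closer together.
    """
    # distance_transform_edt gives each pixel's distance to the nearest
    # zero entry, so inverting a mask yields, at every pixel, the distance
    # to the nearest line pixel of that drawing.
    dist_to_b = distance_transform_edt(~lines_b)
    dist_to_a = distance_transform_edt(~lines_a)

    # Average distance from each drawing's line pixels to the other
    # drawing's nearest line, then symmetrize.
    return 0.5 * (dist_to_b[lines_a].mean() + dist_to_a[lines_b].mean())
```

The LPIPS perceptual metric, the other measure the user study supports, has a reference implementation that can be used directly (assuming Zhang et al.'s `lpips` package; inputs are RGB tensors scaled to [-1, 1]):

```python
import torch
import lpips  # pip install lpips

loss_fn = lpips.LPIPS(net='alex')           # AlexNet backbone (the default)
img0 = torch.rand(1, 3, 256, 256) * 2 - 1   # placeholder frames in [-1, 1]
img1 = torch.rand(1, 3, 256, 256) * 2 - 1
print(loss_fn(img0, img1).item())           # lower = perceptually closer
```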

Published In

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVII
Oct 2022
799 pages
ISBN: 978-3-031-19789-5
DOI: 10.1007/978-3-031-19790-1

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Animation
  2. Video frame interpolation

Cited By

  • (2024) Skeleton-Driven Inbetweening of Bitmap Character Drawings. ACM Transactions on Graphics 43(6), 1–19. DOI: 10.1145/3687955. Online publication date: 19-Dec-2024.
  • (2024) LVCD: Reference-based Lineart Video Colorization with Diffusion Models. ACM Transactions on Graphics 43(6), 1–11. DOI: 10.1145/3687910. Online publication date: 19-Dec-2024.
  • (2024) Joint Stroke Tracing and Correspondence for 2D Animation. ACM Transactions on Graphics 43(3), 1–17. DOI: 10.1145/3649890. Online publication date: 9-Apr-2024.
  • (2024) Revitalizing Traditional Animation: Pre-Composite Frame Interpolation as a Production Catalyst. ACM SIGGRAPH 2024 Posters, 1–2. DOI: 10.1145/3641234.3671038. Online publication date: 25-Jul-2024.
  • (2024) Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation. Computer Vision – ECCV 2024, 346–363. DOI: 10.1007/978-3-031-73414-4_20. Online publication date: 29-Sep-2024.
  • (2023) Collaborative neural rendering using anime character sheets. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 5824–5832. DOI: 10.24963/ijcai.2023/646. Online publication date: 19-Aug-2023.
  • (2023) Video frame interpolation with densely queried bilateral correlation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 1786–1794. DOI: 10.24963/ijcai.2023/198. Online publication date: 19-Aug-2023.
  • (2022) AnimeRun. Proceedings of the 36th International Conference on Neural Information Processing Systems, 18996–19007. DOI: 10.5555/3600270.3601650. Online publication date: 28-Nov-2022.
