DOI: 10.1007/978-3-031-19790-1_17

Article

Improving the Perceptual Quality of 2D Animation Interpolation

Published: 23 October 2022

Abstract

Traditional 2D animation is labor-intensive, often requiring animators to manually draw twelve illustrations per second of movement. While automatic frame interpolation may ease this burden, 2D animation poses additional difficulties compared to photorealistic video. In this work, we address challenges unexplored in previous animation interpolation systems, with a focus on improving perceptual quality. Firstly, we propose SoftsplatLite (SSL), a forward-warping interpolation architecture with fewer trainable parameters and better perceptual performance. Secondly, we design a Distance Transform Module (DTM) that leverages line proximity cues to correct aberrations in difficult solid-color regions. Thirdly, we define a Restricted Relative Linear Discrepancy metric (RRLD) to automate the previously manual training data collection process. Lastly, we explore evaluation of 2D animation generation through a user study, and establish that the LPIPS perceptual metric and chamfer line distance (CD) are more appropriate measures of quality than the PSNR and SSIM metrics used in prior art.
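
To ground the evaluation metrics named above, the sketch below computes a symmetric chamfer line distance (CD) between two binary line drawings using Euclidean distance transforms, the same primitive that line-proximity cues such as the DTM's build on. This is a minimal illustration, assuming SciPy's `distance_transform_edt` and boolean line masks as inputs; the paper's exact line extraction and normalization may differ.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt


def chamfer_line_distance(lines_a: np.ndarray, lines_b: np.ndarray) -> float:
    """Symmetric chamfer distance between two binary line maps.

    lines_a, lines_b: boolean arrays of equal shape, True on line pixels.
    Both drawings are assumed non-empty. Lower values mean the two sets
    of lines sit closer together.
    """
    # distance_transform_edt gives each pixel's distance to the nearest
    # zero entry, so inverting a mask yields, at every pixel, the distance
    # to the nearest line pixel of that drawing.
    dist_to_b = distance_transform_edt(~lines_b)
    dist_to_a = distance_transform_edt(~lines_a)

    # Average distance from each drawing's line pixels to the other
    # drawing's nearest line, then symmetrize.
    return 0.5 * (dist_to_b[lines_a].mean() + dist_to_a[lines_b].mean())
```

The LPIPS perceptual metric, the other measure the user study supports, has a reference implementation that can be used directly (assuming Zhang et al.'s `lpips` package; inputs are RGB tensors scaled to [-1, 1]):

```python
import torch
import lpips  # pip install lpips

loss_fn = lpips.LPIPS(net='alex')           # AlexNet backbone (the default)
img0 = torch.rand(1, 3, 256, 256) * 2 - 1   # placeholder frames in [-1, 1]
img1 = torch.rand(1, 3, 256, 256) * 2 - 1
print(loss_fn(img0, img1).item())           # lower = perceptually closer
```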

Published In

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XVII
Oct 2022
799 pages
ISBN: 978-3-031-19789-5
DOI: 10.1007/978-3-031-19790-1

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Animation
  2. Video frame interpolation

Cited By

  • (2024) Skeleton-Driven Inbetweening of Bitmap Character Drawings. ACM Transactions on Graphics 43(6), 1–19. DOI: 10.1145/3687955. Online publication date: 19-Dec-2024.
  • (2024) LVCD: Reference-based Lineart Video Colorization with Diffusion Models. ACM Transactions on Graphics 43(6), 1–11. DOI: 10.1145/3687910. Online publication date: 19-Dec-2024.
  • (2024) Joint Stroke Tracing and Correspondence for 2D Animation. ACM Transactions on Graphics 43(3), 1–17. DOI: 10.1145/3649890. Online publication date: 9-Apr-2024.
  • (2024) Revitalizing Traditional Animation: Pre-Composite Frame Interpolation as a Production Catalyst. ACM SIGGRAPH 2024 Posters, 1–2. DOI: 10.1145/3641234.3671038. Online publication date: 25-Jul-2024.
  • (2024) Clearer Frames, Anytime: Resolving Velocity Ambiguity in Video Frame Interpolation. Computer Vision – ECCV 2024, 346–363. DOI: 10.1007/978-3-031-73414-4_20. Online publication date: 29-Sep-2024.
  • (2023) Collaborative neural rendering using anime character sheets. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 5824–5832. DOI: 10.24963/ijcai.2023/646. Online publication date: 19-Aug-2023.
  • (2023) Video frame interpolation with densely queried bilateral correlation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 1786–1794. DOI: 10.24963/ijcai.2023/198. Online publication date: 19-Aug-2023.
  • (2022) AnimeRun. Proceedings of the 36th International Conference on Neural Information Processing Systems, 18996–19007. DOI: 10.5555/3600270.3601650. Online publication date: 28-Nov-2022.
