research-article

VideoDoodles: Hand-Drawn Animations on Videos with Scene-Aware Canvases

Authors:

Kevin Blackburn-Matzen,

Rubaiat Habib Kazi,

Adrien Bousseau

ACM Transactions on Graphics (TOG), Volume 42, Issue 4

Article No.: 54, Pages 1 - 12

https://doi.org/10.1145/3592413

Published: 26 July 2023

Abstract

We present an interactive system that eases the creation of so-called video doodles: videos on which artists insert hand-drawn animations for entertainment or educational purposes. Video doodles are challenging to create because, to be convincing, the inserted drawings must appear as if they were part of the captured scene. In particular, the drawings should undergo tracking, perspective deformations, and occlusions as they move with respect to the camera and to other objects in the scene. These visual effects are difficult to reproduce with existing 2D video editing software. Our system supports these effects by relying on planar canvases that users position in a 3D scene reconstructed from the video. Furthermore, we present a custom tracking algorithm that allows users to anchor canvases to static or dynamic objects in the scene, such that the canvases move and rotate to follow the position and direction of these objects. When testing our system, novices could create a variety of short animated clips in about a dozen minutes, while professionals praised its speed and ease of use compared to existing tools.
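
To make the perspective and occlusion effects mentioned in the abstract concrete, the following is a minimal sketch of how a planar canvas placed in a 3D scene could be projected into a video frame and tested against scene depth. It is not the authors' implementation: it assumes a simple pinhole camera, and the function names (`project_point`, `canvas_corners`, `is_visible`), the intrinsics, and the depth tolerance are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def project_point(p_cam: Vec3, focal: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Pinhole projection of a camera-space point; returns (u, v, depth)."""
    x, y, z = p_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal * x / z + cx, focal * y / z + cy, z)


def canvas_corners(center: Vec3, right: Vec3, up: Vec3,
                   half_w: float, half_h: float) -> List[Vec3]:
    """Four corners of a planar canvas spanned by its 'right' and 'up' axes."""
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append(tuple(
            c + sx * half_w * r + sy * half_h * u
            for c, r, u in zip(center, right, up)
        ))
    return corners


def is_visible(canvas_depth: float, scene_depth: float, eps: float = 1e-3) -> bool:
    """A canvas sample is drawn only if it is not behind the scene surface.

    'scene_depth' stands in for a per-pixel depth value from a reconstructed
    scene (an assumption of this sketch); 'eps' is a hypothetical tolerance.
    """
    return canvas_depth <= scene_depth + eps


if __name__ == "__main__":
    # Canvas 4 units in front of the camera, rotated about the vertical axis
    # so that its left edge is nearer to the camera than its right edge.
    corners = canvas_corners((0.0, 0.0, 4.0), (0.6, 0.0, 0.8), (0.0, 1.0, 0.0), 1.0, 1.0)
    for corner in corners:
        print(project_point(corner, focal=800.0, cx=320.0, cy=240.0))
```

In this sketch the nearer edge of the canvas projects taller than the farther edge, which is the perspective deformation a purely 2D overlay would not reproduce. The depth comparison illustrates why a reconstructed scene is needed for occlusions.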

Cited By

Liao J, Van K, Xia Z, Suzuki R. (2024). RealityEffects: Augmenting 3D Volumetric Videos with Object-Centric Annotation and Dynamic Visual Effects. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1248-1261. Online publication date: 1-Jul-2024. https://dl.acm.org/doi/10.1145/3643834.3661631

Xia Z, Monteiro K, Van K, Suzuki R. (2023). RealityCanvas: Augmented Reality Sketching for Embedded and Responsive Scribble Animation Effects. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-14. Online publication date: 29-Oct-2023. https://dl.acm.org/doi/10.1145/3586183.3606716

Index Terms

VideoDoodles: Hand-Drawn Animations on Videos with Scene-Aware Canvases
1. Computing methodologies
  1. Computer graphics
    1. Graphics systems and interfaces
    2. Image manipulation
      1. Computational photography

Information & Contributors

Information

Published In

ACM Transactions on Graphics Volume 42, Issue 4

August 2023

1912 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/3609020

Copyright © 2023 ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States
