research-article

VideoDoodles: Hand-Drawn Animations on Videos with Scene-Aware Canvases

Published: 26 July 2023

Abstract

We present an interactive system to ease the creation of so-called video doodles: videos on which artists insert hand-drawn animations for entertainment or educational purposes. Video doodles are challenging to create because, to be convincing, the inserted drawings must appear as if they were part of the captured scene. In particular, the drawings should undergo tracking, perspective deformations, and occlusions as they move with respect to the camera and to other objects in the scene. These visual effects are difficult to reproduce with existing 2D video editing software. Our system supports these effects by relying on planar canvases that users position in a 3D scene reconstructed from the video. Furthermore, we present a custom tracking algorithm that allows users to anchor canvases to static or dynamic objects in the scene, such that the canvases move and rotate to follow the position and direction of these objects. When testing our system, novices could create a variety of short animated clips in a dozen minutes, while professionals praised its speed and ease of use compared to existing tools.
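
To make the planar-canvas idea above concrete, here is a minimal sketch of how a flat canvas placed in 3D could pick up perspective deformation and occlusion when rendered. It is an illustration only, not the authors' implementation: the pinhole camera model, the per-pixel depth test, and every name and value (`project_point`, `canvas_corners`, `is_visible`, the focal length, the canvas pose) are assumptions introduced for this example.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def project_point(p: Vec3, focal: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Project a camera-space point with an assumed pinhole model.

    Returns the pixel coordinates (u, v) and the depth z of the point.
    """
    x, y, z = p
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (focal * x / z + cx, focal * y / z + cy, z)


def canvas_corners(center: Vec3, right: Vec3, up: Vec3,
                   half_w: float, half_h: float) -> List[Vec3]:
    """Corners of a planar canvas spanned by the unit vectors `right` and `up`."""
    corners = []
    for sx, sy in ((-1, -1), (1, -1), (1, 1), (-1, 1)):
        corners.append(tuple(
            c + sx * half_w * r + sy * half_h * u
            for c, r, u in zip(center, right, up)
        ))
    return corners


def is_visible(canvas_depth: float, scene_depth: float, eps: float = 1e-3) -> bool:
    """Depth test: a canvas pixel is drawn only if it is not behind the scene."""
    return canvas_depth <= scene_depth + eps


# Hypothetical canvas: 4 units in front of the camera, turned away from it
# so that its left edge is closer to the camera than its right edge.
corners = canvas_corners(center=(0.0, 0.0, 4.0), right=(0.6, 0.0, 0.8),
                         up=(0.0, 1.0, 0.0), half_w=1.0, half_h=1.0)
projected = [project_point(c, focal=500.0, cx=320.0, cy=240.0) for c in corners]

# Perspective deformation: the closer (left) edge appears taller on screen.
left_height = projected[3][1] - projected[0][1]    # 312.5 pixels
right_height = projected[2][1] - projected[1][1]   # about 208.33 pixels
print(left_height, right_height)

# Occlusion: a scene surface at depth 2.5 hides a canvas corner at depth 3.2.
print(is_visible(projected[0][2], scene_depth=2.5))
```

In this sketch the canvas stays a flat quad in 3D, so a drawing on it is deformed by the same projection as the rest of the scene, and comparing its depth against a per-pixel scene depth decides whether video content hides it.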

Supplementary Material

ZIP File (papers_558-supplemental.zip): supplemental material
MP4 File (papers_558_VOD.mp4): presentation

Cited By

  • (2024) RealityEffects: Augmenting 3D Volumetric Videos with Object-Centric Annotation and Dynamic Visual Effects. Proceedings of the 2024 ACM Designing Interactive Systems Conference, 1248-1261. DOI: 10.1145/3643834.3661631. Online publication date: 1-Jul-2024.
  • (2023) RealityCanvas: Augmented Reality Sketching for Embedded and Responsive Scribble Animation Effects. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-14. DOI: 10.1145/3586183.3606716. Online publication date: 29-Oct-2023.

Published In

ACM Transactions on Graphics, Volume 42, Issue 4
August 2023
1912 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3609020

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. video editing
2. motion design
3. interface
4. video depth
