Generative Rendering: Controllable 4D-Guided Video Generation with 2D Diffusion Models

S Cai, D Ceylan, M Gadelha… - Proceedings of the …, 2024 - openaccess.thecvf.com
Proceedings of the IEEE/CVF Conference on Computer Vision and …, 2024
Abstract
Traditional 3D content creation tools empower users to bring their imagination to life by giving them direct control over a scene's geometry, appearance, motion, and camera path. Creating computer-generated videos, however, is a tedious manual process, which can be automated by emerging text-to-video diffusion models. Despite great promise, video diffusion models are difficult to control, hindering users from applying their creativity rather than amplifying it. To address this challenge, we present a novel approach that combines the controllability of dynamic 3D meshes with the expressivity and editability of emerging diffusion models. For this purpose, our approach takes an animated, low-fidelity rendered mesh as input and injects the ground-truth correspondence information obtained from the dynamic mesh into various stages of a pre-trained text-to-image generation model to output high-quality and temporally consistent frames. We demonstrate our approach on various examples where motion can be obtained by animating rigged assets or changing the camera path.
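As a rough illustration of what "ground-truth correspondence information obtained from the dynamic mesh" could look like in practice, the sketch below assumes that each rendered frame comes with a UV map (every foreground pixel stores the texture coordinate of the surface point visible there). Pixels in two frames that share the same texel are treated as corresponding, and per-pixel features are copied along those matches. The UV-map encoding, the function names, and the quantization step are all assumptions made for this example; the abstract does not specify how the authors represent or inject correspondences.

```python
from typing import Dict, List, Optional, Tuple

UV = Tuple[float, float]
Pixel = Tuple[int, int]            # (row, col)
UVMap = List[List[Optional[UV]]]   # None marks background pixels (assumed encoding)


def quantize(uv: UV, resolution: int) -> Tuple[int, int]:
    """Snap a continuous UV coordinate in [0, 1] to an integer texel index."""
    def to_index(x: float) -> int:
        x = min(max(x, 0.0), 1.0)
        return min(int(x * resolution), resolution - 1)
    return to_index(uv[0]), to_index(uv[1])


def texel_index(uv_map: UVMap, resolution: int) -> Dict[Tuple[int, int], Pixel]:
    """Index one frame: texel -> first pixel in which that texel is visible."""
    index: Dict[Tuple[int, int], Pixel] = {}
    for r, row in enumerate(uv_map):
        for c, uv in enumerate(row):
            if uv is not None:
                index.setdefault(quantize(uv, resolution), (r, c))
    return index


def correspondences(src: UVMap, dst: UVMap, resolution: int = 64) -> Dict[Pixel, Pixel]:
    """Map each foreground pixel of dst to the src pixel showing the same texel."""
    src_index = texel_index(src, resolution)
    matches: Dict[Pixel, Pixel] = {}
    for r, row in enumerate(dst):
        for c, uv in enumerate(row):
            if uv is None:
                continue
            hit = src_index.get(quantize(uv, resolution))
            if hit is not None:
                matches[(r, c)] = hit
    return matches


def warp_features(src_feats: List[List[float]], matches: Dict[Pixel, Pixel],
                  height: int, width: int, fill: float = 0.0) -> List[List[float]]:
    """Copy per-pixel features from the source frame along the correspondences."""
    out = [[fill] * width for _ in range(height)]
    for (r, c), (sr, sc) in matches.items():
        out[r][c] = src_feats[sr][sc]
    return out
```

In a diffusion setting, the same lookup could in principle be applied to intermediate feature maps rather than scalar values, so that frames sharing a surface point receive consistent features; that extension is speculative here and is not taken from the paper.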