Vidu4D: Single Generated Video to High-Fidelity 4D Reconstruction with Dynamic Gaussian Surfels

27 May 2024 | Yikai Wang, Xinzhou Wang, Zilong Chen, Zhengyi Wang, Fuchun Sun, Jun Zhu
Vidu4D is a reconstruction model that produces high-fidelity 4D (sequential 3D) representations from a single generated video, addressing challenges such as non-rigidity and frame distortion. At its core is the Dynamic Gaussian Surfels (DGS) technique, which optimizes time-varying warping functions that transform Gaussian surfels from a static state to a dynamically warped state, capturing motion and deformation over time. DGS adds two components: a geometric regularization based on continuous warping fields, used to estimate normals, and a refinement of the rotation and scaling parameters, which reduces texture flickering and improves fine-grained appearance detail. Vidu4D also introduces a novel initialization state for the warping fields so that optimization starts from proper conditions.
When integrated with an existing video generative model, Vidu4D demonstrates high-fidelity text-to-4D generation in both appearance and geometry. In extensive experiments, the method outperforms state-of-the-art methods in novel-view color, normal, and surfel feature reconstruction.
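
The static-to-warped transformation described above can be illustrated with a small sketch. This is not the authors' implementation: the summary does not specify the form of the warping functions, so the example assumes a toy closed-form rigid motion (a rotation about the z axis plus a constant-velocity translation), whereas the method optimizes its warping fields. The names `warp_surfel`, `omega`, and `velocity` are hypothetical, as is the convention that the third column of a surfel's rotation matrix is its normal.

```python
import math

def rot_z(angle):
    """3x3 rotation matrix (nested lists, row-major) about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matvec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def warp_surfel(center, rotation, t, omega=math.pi / 2, velocity=(1.0, 0.0, 0.0)):
    """Warp one static surfel to time t with a toy rigid motion.

    Assumption (not from the source): the warp at time t is a rotation
    R(t) = rot_z(omega * t) followed by a translation T(t) = velocity * t.
    The surfel's rotation matrix holds its two tangent axes and its normal
    as columns, so the warped normal is the third column of R(t) @ rotation.
    """
    R_t = rot_z(omega * t)
    T_t = [v * t for v in velocity]
    rotated = matvec(R_t, center)
    warped_center = [rotated[i] + T_t[i] for i in range(3)]
    warped_rotation = matmul(R_t, rotation)
    normal = [warped_rotation[i][2] for i in range(3)]
    return warped_center, warped_rotation, normal

# A static surfel at (1, 0, 0) whose normal points along +x.
static_center = [1.0, 0.0, 0.0]
static_rotation = [[0.0, 0.0, 1.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]]

for t in (0.0, 0.5, 1.0):
    center_t, _, normal_t = warp_surfel(static_center, static_rotation, t)
    print(t, [round(x, 3) for x in center_t], [round(x, 3) for x in normal_t])
```

Because the normal is read directly from the warped orientation, it stays consistent with the motion at every time step. That is the intuition behind tying normal estimation to the warping field, although the regularization summarized above is more involved than this toy case.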