27 May 2024 | Yikai Wang, Xinzhou Wang, Zilong Chen, Zhengyi Wang, Fuchun Sun, Jun Zhu
Vidu4D is a novel 4D reconstruction model that accurately reconstructs sequential 3D representations from single generated videos, addressing challenges such as non-rigidity and frame distortion. The core of Vidu4D is Dynamic Gaussian Surfels (DGS), which optimizes time-varying warping functions to transform Gaussian surfels from a static state to a dynamically warped state, enabling precise depiction of motion and deformation over time. DGS also incorporates warped-state geometric regularization and refined rotation and scaling parameters of Gaussian surfels to enhance the capture of fine-grained appearance details and reduce texture flickering during warping.
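To make the idea of warping surfels from a static state to a time-dependent state more concrete, here is a minimal sketch in plain Python. It is not the DGS implementation: the `Surfel` class, the helper names, and in particular `toy_warp` (a hand-written rigid motion standing in for a learned, time-varying warping function) are all assumptions introduced for illustration. The sketch only shows the general pattern: a warp at time `t` moves a surfel's center and re-orients it, while its in-plane scales are left untouched here.

```python
import math
from dataclasses import dataclass

Vec3 = tuple[float, float, float]
Mat3 = tuple[Vec3, Vec3, Vec3]  # row-major 3x3 rotation matrix


@dataclass(frozen=True)
class Surfel:
    """Hypothetical flat Gaussian: a center, an orientation, and two in-plane scales."""
    center: Vec3
    rotation: Mat3              # third column is taken as the surfel normal
    scale: tuple[float, float]


def matvec(m: Mat3, v: Vec3) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


def matmul(a: Mat3, b: Mat3) -> Mat3:
    """Multiply two 3x3 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def toy_warp(t: float) -> tuple[Mat3, Vec3]:
    """Assumed stand-in for a learned warp: rotate about z by t * 90 degrees
    and translate along x by t. A real model would optimize this function."""
    angle = t * math.pi / 2
    c, s = math.cos(angle), math.sin(angle)
    rot = ((c, -s, 0.0), (s, c, 0.0), (0.0, 0.0, 1.0))
    return rot, (t, 0.0, 0.0)


def warp_surfel(surfel: Surfel, t: float) -> Surfel:
    """Map a static-state surfel to its warped state at time t."""
    rot, trans = toy_warp(t)
    moved = matvec(rot, surfel.center)
    center = tuple(m + d for m, d in zip(moved, trans))
    # Compose the warp rotation with the surfel's own orientation.
    return Surfel(center, matmul(rot, surfel.rotation), surfel.scale)


def normal(surfel: Surfel) -> Vec3:
    """Read the surfel normal from the third column of its rotation."""
    return tuple(surfel.rotation[i][2] for i in range(3))


# A static surfel at (1, 0, 0) whose normal points along +x.
static = Surfel(
    center=(1.0, 0.0, 0.0),
    rotation=((0.0, 0.0, 1.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0)),
    scale=(0.1, 0.2),
)
```

With this toy warp, `warp_surfel(static, 1.0)` moves the center to approximately (1, 1, 0) and turns the normal from +x to +y, up to floating-point error. The warped normal is the kind of quantity a warped-state geometric regularizer could act on, although the specific regularization used by DGS is not reproduced here.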
Paired with an existing video generative model, Vidu4D demonstrates high-fidelity text-to-4D generation in both appearance and geometry. Evaluated on generated videos, the method outperforms state-of-the-art methods both qualitatively and quantitatively, producing realistic and immersive 4D content. Limitations include reliance on the quality of the input video, difficulty scaling to large scenes, and computational challenges for real-time applications.