TC4D: Trajectory-Conditioned Text-to-4D Generation

2024-04-11 | Sherwin Bahmani*1,2,3, Xian Liu*4, Yifan Wang*5, Ivan Skorokhodov3, Victor Rong1,2, Ziwei Liu6, Xihui Liu7, Jeong Joon Park8, Sergey Tulyakov3, Gordon Wetzstein5, Andrea Tagliasacchi1,9,10, and David B. Lindell1,2
The paper introduces TC4D (Trajectory-Conditioned Text-to-4D Generation), a novel approach to generating dynamic 3D scenes from text prompts. TC4D addresses the limitations of existing 4D generation methods by decomposing motion into global and local components, enabling the synthesis of scenes that move along arbitrary trajectories with more realistic motion. Global motion is modeled as a rigid transformation along a spline-parameterized trajectory, while local motion is captured by a deformation model supervised with guidance from a text-to-video model. This decomposition allows the synthesis of large-scale motions that are more realistic and extensive than those produced by previous methods.

A user study and ablation experiments show that TC4D significantly outperforms existing techniques in motion quality, realism, and amount of motion. The approach also supports compositional scene generation and can handle multiple interacting objects. The authors discuss future directions, including end-to-end pipelines for generating initial layouts and trajectories, and the development of automated metrics for text-to-4D generation.
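To make the global/local motion decomposition concrete, below is a minimal sketch of how points in a canonical object frame might be warped at time t. This is not the authors' implementation: the spline type (uniform cubic B-spline), the frame convention (rotation aligned with the trajectory tangent), and all function names, including the placeholder `local_deform`, are assumptions for illustration.

```python
import numpy as np

def eval_cubic_bspline(control_pts, t):
    """Evaluate a uniform cubic B-spline trajectory at normalized time t in [0, 1].
    control_pts: (K, 3) array of 3D control points, K >= 4."""
    n_seg = len(control_pts) - 3
    s = min(max(t, 0.0) * n_seg, n_seg - 1e-6)  # map t to segment index + local u
    i, u = int(s), s - int(s)
    # Standard uniform cubic B-spline basis matrix.
    B = (1.0 / 6.0) * np.array([[-1,  3, -3, 1],
                                [ 3, -6,  3, 0],
                                [-3,  0,  3, 0],
                                [ 1,  4,  1, 0]])
    U = np.array([u**3, u**2, u, 1.0])
    return U @ B @ control_pts[i:i + 4]

def trajectory_pose(control_pts, t, eps=1e-3):
    """Rigid pose along the trajectory: translation from the spline, rotation
    aligning the object's forward axis with the (finite-difference) tangent.
    Assumes the trajectory is roughly horizontal so 'up' stays well-defined."""
    p = eval_cubic_bspline(control_pts, t)
    tangent = (eval_cubic_bspline(control_pts, min(t + eps, 1.0))
               - eval_cubic_bspline(control_pts, max(t - eps, 0.0)))
    fwd = tangent / (np.linalg.norm(tangent) + 1e-8)
    up = np.array([0.0, 1.0, 0.0])
    right = np.cross(up, fwd)
    right /= np.linalg.norm(right) + 1e-8
    R = np.stack([right, np.cross(fwd, right), fwd], axis=1)
    return R, p

def warp_points(x_canonical, control_pts, t, local_deform):
    """Compose local deformation (applied in the canonical frame)
    with the global rigid motion along the trajectory."""
    x_local = x_canonical + local_deform(x_canonical, t)  # local, learned motion
    R, p = trajectory_pose(control_pts, t)
    return x_local @ R.T + p                              # global rigid motion
```

A usage sketch: in the paper's pipeline, the local deformation is a learned field optimized with supervision from a text-to-video model; here a zero deformation stands in as a placeholder.

```python
ctrl = np.array([[0, 0, 0], [1, 0, 0.5], [2, 0, 1.5],
                 [3, 0, 2.0], [4, 0, 2.0]], dtype=float)
zero_deform = lambda x, t: np.zeros_like(x)  # stand-in for the deformation network
pts = np.random.rand(8, 3)                   # points in the canonical frame
print(warp_points(pts, ctrl, 0.5, zero_deform))
```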