Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation

24 May 2024 | Mathis Petrovich, Or Litany, Umar Iqbal, Michael J. Black, Gül Varol, Xue Bin Peng, Davis Rempe
The paper introduces a new problem setting for text-driven 3D human motion synthesis: multi-track timeline control. Users specify multiple actions and their timings on a structured, intuitive timeline, which gives fine-grained control over the generated motion. The proposed method, Spatio-Temporal Motion Collage (STMC), handles the multi-track input by denoising each text prompt independently and then stitching the results together in both space and time. STMC builds on pre-trained motion diffusion models such as MDM and MotionDiffuse, which are adapted to support the SMPL body representation. The paper evaluates STMC on a new multi-track timeline dataset and compares it with several baselines. According to quantitative metrics and a perceptual study, STMC outperforms existing methods in both semantic correctness and motion realism.
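To make the idea of a multi-track timeline more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. The `Interval` class, the body-part labels ("legs", "arms"), the `fake_denoise` stub, and the rule of averaging where prompts overlap are all assumptions introduced for illustration. The sketch only shows the general shape of the approach described above: each prompt is processed on its own, and the per-prompt outputs are then combined across time (frames) and space (body parts).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """One timeline entry (hypothetical structure, not from the paper)."""
    text: str          # text prompt for this interval
    start: int         # first frame, inclusive
    end: int           # last frame, exclusive
    parts: frozenset   # assumed body-part labels this prompt controls


def fake_denoise(interval, part, frame):
    """Stand-in for a per-prompt diffusion model call; returns a dummy scalar."""
    return float(len(interval.text))


def stitch(timeline, parts, num_frames, denoise=fake_denoise):
    """Combine independent per-prompt outputs into one list of per-frame dicts."""
    motion = []
    for frame in range(num_frames):
        row = {}
        for part in parts:
            # Collect the output of every prompt active on this frame and part.
            values = [
                denoise(iv, part, frame)
                for iv in timeline
                if iv.start <= frame < iv.end and part in iv.parts
            ]
            # Assumed rule: average overlapping prompts; None if nothing covers.
            row[part] = sum(values) / len(values) if values else None
        motion.append(row)
    return motion


PARTS = ("legs", "arms")
timeline = [
    Interval("walk", 0, 6, frozenset({"legs"})),
    Interval("wave hand", 2, 4, frozenset({"arms"})),
    Interval("sit", 4, 8, frozenset({"legs"})),
]
motion = stitch(timeline, PARTS, num_frames=8)
```

In this toy timeline, "walk" and "wave hand" run at the same time on different body parts, while "walk" and "sit" overlap on frames 4 and 5 for the same body part, where the sketch simply averages the two dummy outputs. A real system would replace `fake_denoise` with a motion diffusion model and would need a more careful blending strategy.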