FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis


24 May 2024 | Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, and Lizhuang Ma
FreeMotion is a unified framework for number-free text-to-motion synthesis. Existing methods are tailored to either single-person or two-person motion generation and do not scale to scenes with more people. FreeMotion addresses this by modeling the conditional motion distribution, unifying single- and multi-person generation under one formulation.

The framework decouples the generation process into two modules: a generation module that synthesizes single-person motion from a text prompt, and an interaction module that injects the motions of other individuals as conditions into that generation process. Because the two roles are separated, the same model applies to any number of people: although it is trained only on two-person motions, it can generate motions for more than two individuals at inference time. The framework also supports precise spatial control of multi-person motion through flexible spatial signals.

The paper reviews related work on single-person and multi-person motion synthesis as well as diffusion models, and evaluates FreeMotion on the InterHuman dataset. Extensive experiments show that it outperforms existing methods on both single- and multi-person generation, producing realistic and diverse motions with superior fidelity, diversity, and spatial control.
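To make the decoupling concrete, here is a minimal PyTorch sketch of the idea as described above: a generation block that denoises one person's motion conditioned on text, and an interaction block that injects the other persons' motions through cross-attention, applied per person so the same weights work for any number of individuals. The module names, layer choices, dimensions, and the way conditions are concatenated are illustrative assumptions for this summary, not the paper's exact architecture.

```python
# Hedged sketch of a generation/interaction split for number-free motion denoising.
# All layer choices and shapes here are assumptions, not the paper's actual design.
import torch
import torch.nn as nn


class GenerationBlock(nn.Module):
    """Single-person branch: self-attention over the motion sequence plus text conditioning."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.self_attn = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.text_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, motion: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        x = self.self_attn(motion)              # (B, T, dim): refine own motion tokens
        ctx, _ = self.text_attn(x, text, text)  # attend to text tokens as the condition
        return self.norm(x + ctx)


class InteractionBlock(nn.Module):
    """Interaction branch: cross-attention from one person's motion to the others' motions."""

    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, motion: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
        ctx, _ = self.cross_attn(motion, others, others)  # inject condition motions
        return self.norm(motion + ctx)


def denoise_step(persons, text, gen_block, int_block):
    """One denoising step for N persons: each person is generated from the text,
    then conditioned on the concatenation of all other persons' motions, so the
    same two modules handle any number of individuals."""
    updated = []
    for i, motion in enumerate(persons):
        x = gen_block(motion, text)
        if len(persons) > 1:
            others = torch.cat([p for j, p in enumerate(persons) if j != i], dim=1)
            x = int_block(x, others)
        updated.append(x)
    return updated


# Toy usage: three persons, 60 frames each, 256-d motion tokens, 16 text tokens.
if __name__ == "__main__":
    B, T, D = 2, 60, 256
    persons = [torch.randn(B, T, D) for _ in range(3)]
    text = torch.randn(B, 16, D)
    out = denoise_step(persons, text, GenerationBlock(D), InteractionBlock(D))
    print([o.shape for o in out])  # three tensors of shape (2, 60, 256)
```

The point of the sketch is the generalization argument: the interaction block only ever sees "my motion" versus "everyone else's motions", so a model trained on pairs can, in principle, be run over three or more people at inference by looping the same blocks per person.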