ParCo: Part-Coordination Text-to-Motion Synthesis

23 Jul 2024 | Qiran Zou*, Shangyuan Yuan*, Shian Du, Yu Wang, Chang Liu, Yi Xu, Jie Chen, and Xiangyang Ji
ParCo is a text-to-motion synthesis method that generates coordinated, fine-grained motion by improving the understanding of part motions and the communication among the different part motion generators. It splits whole-body motion into multiple part motions and uses VQ-VAEs to encode each part motion into a sequence of quantized codes. These code sequences are then used to train multiple lightweight transformers for part motion generation, which are coordinated through a part coordination module. This design allows efficient motion synthesis, with lower computational complexity and shorter generation time than existing methods.
ParCo achieves superior performance on the HumanML3D and KIT-ML benchmark datasets, generating realistic, coordinated motions that align with their textual descriptions. It also handles complex textual descriptions that involve different body parts and enables precise part-level control, making it a promising solution for text-to-motion generation.
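The part-wise discretization described above can be illustrated with a minimal sketch. It is not the authors' implementation: the joint-to-part partition in `PARTS`, the toy scalar joint features, the codebook size, and all function names are assumptions made for illustration, and the learned VQ-VAE encoder and decoder are omitted so that only the nearest-codebook lookup is shown.

```python
import random

# Hypothetical partition of joint indices into body parts (an assumption,
# not taken from the source).
PARTS = {
    "root": [0],
    "backbone": [1, 2],
    "left_arm": [3, 4],
    "right_arm": [5, 6],
    "left_leg": [7, 8],
    "right_leg": [9, 10],
}


def split_into_parts(motion):
    """Split a whole-body motion (frames x joints) into per-part motions."""
    return {
        name: [[frame[j] for j in joints] for frame in motion]
        for name, joints in PARTS.items()
    }


def quantize(part_motion, codebook):
    """Map each frame to the index of its nearest codebook vector (squared L2)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: dist(frame, codebook[k]))
        for frame in part_motion
    ]


def make_toy_data(num_frames=8, num_joints=11, codebook_size=4, seed=0):
    """Build a random toy motion and one random codebook per part."""
    rng = random.Random(seed)
    motion = [
        [rng.uniform(-1, 1) for _ in range(num_joints)]
        for _ in range(num_frames)
    ]
    codebooks = {
        name: [[rng.uniform(-1, 1) for _ in joints] for _ in range(codebook_size)]
        for name, joints in PARTS.items()
    }
    return motion, codebooks


def encode_motion(motion, codebooks):
    """Return one sequence of discrete code indices per body part."""
    parts = split_into_parts(motion)
    return {name: quantize(parts[name], codebooks[name]) for name in PARTS}
```

Calling `encode_motion(*make_toy_data())` returns a dictionary with one code sequence per part. In the method as summarized above, sequences of this kind are what the part-level transformers are trained on; the part coordination module is outside the scope of this sketch.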