Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts

13 Mar 2024 | Yue Ma*, Yingqing He*, Hongfa Wang*, Andong Wang, Chenyang Qi, Chengfei Cai, Xiu Li, Zhifeng Li, Heung-Yeung Shum, Wei Liu, and Qifeng Chen
Follow-Your-Click is a framework for regional image animation driven by a user click and a short motion prompt. The click specifies what to move and the prompt specifies how to move it, giving precise control over local image animation. This addresses two limitations of existing image-to-video (I2V) methods, which often lack local control and require detailed scene descriptions.
The framework introduces a first-frame masking strategy to improve video generation quality and a motion-augmented module to improve adherence to short prompts. It also proposes flow-based motion magnitude control to regulate motion speed precisely. Multiple clicks allow multiple objects and motion types to be animated, and the framework integrates with control signals such as human skeletons for fine-grained motion control. The authors describe it as the first framework to support regional image animation from a simple click and a short motion prompt. In extensive experiments on 8 metrics, together with user studies, Follow-Your-Click outperforms 7 baselines, including commercial tools and research methods, showing better video-text alignment, temporal consistency, and motion quality. Its main limitation is generating large and complex human motions, which is attributed to dataset bias and motion complexity.
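The flow-based motion magnitude control mentioned above can be illustrated with a minimal sketch. It assumes that motion magnitude is measured as the average optical-flow vector length inside the selected region, a detail this summary does not spell out. The function name `motion_magnitude`, the nested-list data layout, and the toy flow values are hypothetical and are not taken from the authors' implementation.

```python
import math
from typing import List, Tuple

# Assumed layout (hypothetical): flow[y][x] = (dx, dy) displacement in pixels,
# mask[y][x] = True for pixels inside the region selected by the click.
Flow = List[List[Tuple[float, float]]]
Mask = List[List[bool]]


def motion_magnitude(flow: Flow, mask: Mask) -> float:
    """Mean optical-flow vector length over masked pixels (0.0 for an empty mask)."""
    total = 0.0
    count = 0
    for flow_row, mask_row in zip(flow, mask):
        for (dx, dy), selected in zip(flow_row, mask_row):
            if selected:
                total += math.hypot(dx, dy)  # Euclidean length of the flow vector
                count += 1
    return total / count if count else 0.0


# Toy 2x2 flow field: only the left column lies inside the selected region.
flow = [[(3.0, 4.0), (0.0, 0.0)],
        [(6.0, 8.0), (1.0, 0.0)]]
mask = [[True, False],
        [True, False]]
print(motion_magnitude(flow, mask))  # 7.5
```

Under this assumption, restricting the average to the selected region keeps static background pixels from diluting the measurement, which is one plausible reason to prefer a masked flow statistic when the goal is regional control.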