MotionMaster: Training-free Camera Motion Transfer For Video Generation


1 May 2024 | Teng Hu*, Jiangning Zhang*, Ran Yi†, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma
**Abstract:** The emergence of diffusion models has significantly advanced image and video generation. Controllable video generation, including text-to-video, image-to-video, video editing, and video motion control, has attracted growing attention. However, existing camera motion control methods require training a temporal camera module, which is computationally expensive and limits the flexibility of camera control. To address these issues, we propose MotionMaster, a novel training-free camera motion transfer model that disentangles camera motion from object motion in source videos and transfers the extracted camera motion to new videos. We introduce a one-shot camera motion disentanglement method that extracts camera motion from a single video by segmenting the moving objects and recovering the camera motion in those regions through a Poisson equation. We further propose a few-shot camera motion disentanglement method that extracts the camera motion shared by several videos with similar camera movements via window-based clustering. Finally, we develop a camera motion combination method that enables flexible camera control. Extensive experiments demonstrate that MotionMaster effectively decouples camera and object motion and achieves flexible, diverse camera motion control in controllable video generation tasks.

**Keywords:** Video Generation, Video Motion, Camera Motion, Disentanglement

**Introduction:** Recent advances in generative models have brought significant progress in image and video generation, and controllable video generation has become increasingly important for applications such as film production, virtual reality, and video games. Existing camera motion control methods typically rely on training a temporal camera module, which is computationally intensive and restricts the flexibility of camera control. MotionMaster addresses these limitations by disentangling camera motion from object motion in source videos and transferring the extracted camera motion to newly generated videos, using a one-shot disentanglement method, a few-shot disentanglement method, and a camera motion combination method.

**Method:** MotionMaster disentangles camera motion from object motion in a source video and transfers the extracted camera motion to new videos. For one-shot disentanglement, moving objects are identified with a segmentation model and the camera motion inside those regions is recovered by solving a Poisson equation, yielding a camera-only motion estimate from a single video. For few-shot disentanglement, the camera motion common to several videos with similar camera movements is extracted via window-based clustering. A camera motion combination method then enables flexible control, such as blending several camera motions into one or applying different camera motions to different regions of the frame. Illustrative code sketches of these steps are provided at the end of this summary.

**Experiments:** We evaluate MotionMaster with metrics including FVD, FID-V, and Optical Flow Distance. The results show that MotionMaster outperforms state-of-the-art methods in generation quality, diversity, and camera motion accuracy.
MotionMaster generates high-quality and diverse videos from only one-shot or few-shot data, without any need for training.
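
The one-shot disentanglement described above can be viewed as a motion-completion problem: the motion inside the segmented object regions is discarded and re-estimated so that it varies smoothly and agrees with the surrounding background motion, which is attributed to the camera. The following is a minimal sketch of that idea on a dense 2D motion field rather than on the model's internal motion representation; the function name, the Jacobi solver, and the iteration count are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def complete_camera_motion(motion, object_mask, n_iters=2000):
    """Fill the motion field inside `object_mask` by solving the discrete
    Laplace/Poisson equation, with the surrounding background motion acting
    as the boundary condition.

    motion:      (H, W, 2) float array, per-pixel motion estimated from the video
    object_mask: (H, W) bool array, True where moving objects were segmented
    """
    filled = motion.copy()
    # Initialize the masked region with the mean background motion for faster convergence.
    filled[object_mask] = motion[~object_mask].mean(axis=0)

    inner = object_mask.copy()
    inner[0, :] = inner[-1, :] = inner[:, 0] = inner[:, -1] = False  # keep image borders fixed

    for _ in range(n_iters):  # Jacobi iterations of the Laplace equation
        neighbors = (np.roll(filled, 1, 0) + np.roll(filled, -1, 0) +
                     np.roll(filled, 1, 1) + np.roll(filled, -1, 1)) / 4.0
        filled[inner] = neighbors[inner]
    return filled
```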
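For the few-shot case, one way to picture window-based clustering is: split the frame into windows, collect one motion estimate per video for each window, and keep the value the videos agree on as the shared camera motion, treating outlier clusters as object motion. The sketch below is a hypothetical simplification using k-means over window-averaged motion vectors; the window size, the number of clusters, and all names are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def common_camera_motion(motions, window=16, k=2):
    """Estimate the camera motion shared by several videos.

    motions: list of (H, W, 2) motion fields, one per video, all the same size.
    Returns an (H // window, W // window, 2) coarse camera-motion field.
    """
    H, W, _ = motions[0].shape
    gh, gw = H // window, W // window
    camera = np.zeros((gh, gw, 2))

    for i in range(gh):
        for j in range(gw):
            # One window-averaged motion vector per video.
            votes = np.stack([
                m[i * window:(i + 1) * window, j * window:(j + 1) * window]
                .reshape(-1, 2).mean(axis=0)
                for m in motions
            ])
            km = KMeans(n_clusters=min(k, len(votes)), n_init=10).fit(votes)
            # The largest cluster is treated as the cross-video consensus,
            # i.e. the camera motion; smaller clusters are treated as object motion.
            counts = np.bincount(km.labels_)
            camera[i, j] = km.cluster_centers_[counts.argmax()]
    return camera
```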
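The camera motion combination step can be read as simple field arithmetic: a normalized weighted sum of extracted camera motions produces a compound motion (for example, pan plus zoom), and binary region masks let different camera motions drive different parts of the frame. A minimal sketch under those assumptions:

```python
import numpy as np

def combine_motions(motions, weights):
    """Linearly combine several camera-motion fields (e.g. pan + zoom-in)."""
    motions = np.stack(motions)                      # (N, H, W, 2)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()                # normalize weights to sum to 1
    return np.tensordot(weights, motions, axes=1)    # (H, W, 2)

def regional_motion(motions, region_masks):
    """Apply a different camera motion inside each region of the frame.

    motions:      list of (H, W, 2) camera-motion fields
    region_masks: list of (H, W) bool masks, one per motion, covering the frame
    """
    out = np.zeros_like(motions[0])
    for motion, mask in zip(motions, region_masks):
        out[mask] = motion[mask]
    return out
```

In practice the masks passed to `regional_motion` could come from the same segmentation model used for disentanglement or from user-specified regions; that choice is a usage detail, not something fixed by the method itself.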
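The Optical Flow Distance comparison presumably measures how closely the generated video's motion follows the reference camera motion. One plausible way to compute such a score, not necessarily the paper's exact definition, is the average endpoint error between dense optical flow fields of the generated and reference videos:

```python
import cv2
import numpy as np

def flow_distance(gen_frames, ref_frames):
    """Average endpoint error between the optical flow of a generated video
    and that of a reference video with the desired camera motion.

    gen_frames, ref_frames: lists of grayscale uint8 frames of equal size and length.
    """
    dists = []
    for t in range(len(gen_frames) - 1):
        f_gen = cv2.calcOpticalFlowFarneback(gen_frames[t], gen_frames[t + 1],
                                             None, 0.5, 3, 15, 3, 5, 1.2, 0)
        f_ref = cv2.calcOpticalFlowFarneback(ref_frames[t], ref_frames[t + 1],
                                             None, 0.5, 3, 15, 3, 5, 1.2, 0)
        dists.append(np.linalg.norm(f_gen - f_ref, axis=-1).mean())
    return float(np.mean(dists))
```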