MoCoGAN: Decomposing Motion and Content for Video Generation

14 Dec 2017 | Sergey Tulyakov, Ming-Yu Liu, Xiaodong Yang, Jan Kautz
MoCoGAN is a framework for video generation that decomposes the visual signal into content and motion. The framework generates a video by mapping a sequence of random vectors, each consisting of a content part and a motion part, to a sequence of video frames. The content part is kept fixed across the video, while the motion part is realized as a stochastic process. To learn this decomposition in an unsupervised manner, MoCoGAN introduces an adversarial learning scheme that uses both an image discriminator and a video discriminator. Extensive experiments on several datasets demonstrate the effectiveness of MoCoGAN, showing its ability to generate videos with different content and motion, as well as videos with the same content but different motions. The framework outperforms state-of-the-art approaches in content consistency, inception score, and user preference score. Additionally, MoCoGAN can generate videos with categorical dynamics, such as different facial expressions, and perform image-to-video translation.
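To make the decomposition concrete, the sketch below shows a MoCoGAN-style generator in PyTorch: a content code is sampled once per video, while a recurrent network turns i.i.d. noise into a correlated per-frame motion code, and an image generator renders each frame from the concatenated code. This is a minimal illustration under assumed dimensions and a placeholder deconvolutional architecture, not the authors' exact network; the image and video discriminators and the training loop are omitted.

```python
import torch
import torch.nn as nn

class MoCoGANStyleGenerator(nn.Module):
    """Illustrative sketch: one fixed content code per video,
    a per-frame motion code produced by a recurrent network."""
    def __init__(self, dim_content=50, dim_motion=10, dim_noise=10, n_channels=3):
        super().__init__()
        self.dim_content, self.dim_motion, self.dim_noise = dim_content, dim_motion, dim_noise
        # GRU cell maps i.i.d. noise to a temporally correlated motion trajectory.
        self.motion_rnn = nn.GRUCell(dim_noise, dim_motion)
        # Placeholder deconvolutional image generator (1x1 latent -> 32x32 frame).
        self.frame_gen = nn.Sequential(
            nn.ConvTranspose2d(dim_content + dim_motion, 256, 4, 1, 0), nn.ReLU(),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),
            nn.ConvTranspose2d(64, n_channels, 4, 2, 1), nn.Tanh(),
        )

    def forward(self, batch_size, video_len):
        # Content code z_c: sampled once and shared by every frame of the video.
        z_c = torch.randn(batch_size, self.dim_content)
        h = torch.zeros(batch_size, self.dim_motion)
        frames = []
        for _ in range(video_len):
            # Motion code z_m^(t): a stochastic process driven by fresh noise at each step.
            eps = torch.randn(batch_size, self.dim_noise)
            h = self.motion_rnn(eps, h)
            z = torch.cat([z_c, h], dim=1).unsqueeze(-1).unsqueeze(-1)
            frames.append(self.frame_gen(z))
        return torch.stack(frames, dim=1)  # (B, T, C, H, W)

gen = MoCoGANStyleGenerator()
video = gen(batch_size=4, video_len=16)  # 4 videos of 16 frames each
```

Because the content code is fixed, resampling only the motion noise yields the "same content, different motions" behavior described above, while resampling the content code changes the identity or appearance across videos.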