1 Jun 2024 | Shenhao Zhu*¹, Junming Leo Chen*², Zuozhuo Dai³, Qingkun Su³, Yinghui Xu², Xun Cao¹, Yao Yao¹, Hao Zhu†¹, and Siyu Zhu†²
This paper introduces a novel methodology for human image animation that integrates a 3D parametric human model (SMPL) within a latent diffusion framework. The approach enhances shape alignment and motion guidance, improving the quality and realism of generated animations. Key contributions include:
1. **3D Parametric Model**: Utilizes the SMPL (Skinned Multi-Person Linear) model to establish a unified representation of body shape and pose, capturing intricate geometric and motion characteristics from source videos.
2. **Motion Guidance**: Incorporates depth images, normal maps, and semantic maps rendered from SMPL sequences, along with skeleton-based motion guidance, to enrich the latent diffusion model with comprehensive 3D shape and detailed pose attributes (a projection sketch follows this list).
3. **Multi-Layer Motion Fusion**: A multi-layer motion fusion module applies self-attention to fuse shape and motion latent representations, improving the model's ability to generate accurate and temporally consistent animations.
4. **Experimental Validation**: Demonstrates superior performance on benchmark datasets (TikTok and UBC fashion video datasets) and a novel in-the-wild dataset, showing enhanced generalization capabilities.
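To make the rendered-guidance idea concrete, the sketch below projects camera-space SMPL vertices through a pinhole intrinsic matrix and splats them into a depth map with a z-buffer. This is a rough stand-in, not the paper's renderer: a real pipeline would rasterize the full SMPL mesh (e.g., with pyrender or nvdiffrast), and the function and parameter names here are illustrative assumptions.

```python
# Minimal sketch: vertex-splat depth rendering from SMPL geometry,
# assuming a pinhole camera and camera-space vertices with z > 0.
# A proper implementation would rasterize mesh faces, not splat points.
import numpy as np

def splat_depth(vertices: np.ndarray, K: np.ndarray, hw=(256, 256)) -> np.ndarray:
    """vertices: (N, 3) camera-space points; K: (3, 3) camera intrinsics."""
    h, w = hw
    depth = np.full((h, w), np.inf)
    uvz = (K @ vertices.T).T                  # perspective projection
    uv = (uvz[:, :2] / uvz[:, 2:3]).round().astype(int)
    z = vertices[:, 2]
    inside = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    for (u, v), d in zip(uv[inside], z[inside]):
        depth[v, u] = min(depth[v, u], d)     # z-buffer: keep nearest surface
    depth[np.isinf(depth)] = 0.0              # background pixels
    return depth

# Example with a random point cloud standing in for SMPL's 6890 vertices.
K = np.array([[500.0, 0, 128], [0, 500.0, 128], [0, 0, 1]])
verts = np.random.randn(6890, 3) * 0.3 + np.array([0.0, 0.0, 2.5])
print(splat_depth(verts, K).shape)  # (256, 256)
```

Normal and semantic maps would be produced the same way, carrying per-vertex normals or body-part labels through the projection instead of depth values.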
The methodology is structured around three components:
1. **SMPL Model Integration**: Projects SMPL sequences into image space to generate depth, normal, and semantic maps.
2. **Skeleton-Based Motion Guidance**: Adds skeleton signals to guide intricate motions, such as facial expressions and finger articulation, with greater precision.
3. **Multi-Layer Feature Embedding**: Uses self-attention to integrate multi-layer feature embeddings as conditioning for a latent video diffusion model, leading to precise image animation (a fusion sketch follows this list).
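The fusion step can be pictured as plain self-attention over tokens drawn from each guidance latent. The module below is a minimal sketch under that assumption; the class name, tensor shapes, and the choice to average over the condition axis are illustrative, not the paper's implementation.

```python
# Minimal sketch of multi-condition fusion: guidance latents (depth, normal,
# semantic, skeleton) are flattened into tokens and fused with self-attention.
# Names and shapes are assumptions, not the paper's actual module.
import torch
import torch.nn as nn

class MotionFusion(nn.Module):
    def __init__(self, channels: int, num_heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)

    def forward(self, guidance_latents: list[torch.Tensor]) -> torch.Tensor:
        # Each latent: (B, C, H, W). Turn every condition into H*W tokens.
        b, c, h, w = guidance_latents[0].shape
        tokens = torch.stack(
            [g.flatten(2).transpose(1, 2) for g in guidance_latents], dim=1
        ).reshape(b, -1, c)                       # (B, num_conditions*H*W, C)
        q = self.norm(tokens)
        fused, _ = self.attn(q, q, q)             # self-attention across all tokens
        fused = tokens + fused                    # residual connection
        # Average over the condition axis to recover one fused spatial map.
        fused = fused.reshape(b, len(guidance_latents), h * w, c).mean(dim=1)
        return fused.transpose(1, 2).reshape(b, c, h, w)

# Example: fuse depth, normal, semantic, and skeleton latents of size 16x16.
latents = [torch.randn(2, 320, 16, 16) for _ in range(4)]
fusion = MotionFusion(channels=320)
print(fusion(latents).shape)  # torch.Size([2, 320, 16, 16])
```

The fused map would then serve as the conditioning signal injected into the video diffusion UNet at the matching resolution.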
The paper also includes a comprehensive evaluation, comparing the proposed approach with state-of-the-art methods, and discusses limitations and future directions.