Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control


9 Jul 2024 | Yue Han*, Junwei Zhu*, Keke He, Xu Chen, Yanhao Ge, Wei Li, Xiangtai Li, Jiangning Zhang, Chengjie Wang, and Yong Liu†
The paper introduces Face-Adapter, an efficient and effective adapter designed for high-precision, high-fidelity face editing with pre-trained diffusion models. Face-Adapter addresses the limitations of current face reenactment and swapping methods, which rely primarily on GAN frameworks, by providing a unified model that handles both tasks. Its key components are:

1. **Spatial Condition Generator (SCG)**: Predicts 3D landmarks and adapts the foreground mask, providing precise guidance for controllable generation.
2. **Identity Encoder (IE)**: Transfers face embeddings into the text embedding space via a transformer decoder, improving identity consistency.
3. **Attribute Controller (AC)**: Integrates spatial conditions and detailed attributes, enabling conditional inpainting.

Face-Adapter achieves superior motion control precision, identity retention, and generation quality compared to fully fine-tuned GAN-based models. Because only the lightweight adapter is trained, it also integrates seamlessly with various StableDiffusion models, reducing training costs and preventing overfitting. The method is evaluated on VoxCeleb1/2 and FaceForensics++, demonstrating its effectiveness in both face reenactment and swapping tasks.
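To make the Identity Encoder concrete, here is a minimal PyTorch sketch of the idea described above: a face recognition embedding is mapped into a sequence of text-space tokens by a transformer decoder with learnable query tokens. The dimensions, query count, and layer count are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class IdentityEncoder(nn.Module):
    """Sketch of the Identity Encoder (IE): transfers a face embedding
    into text-space tokens via a transformer decoder. All hyperparameters
    here are assumed, not taken from the paper."""

    def __init__(self, face_dim=512, text_dim=768, num_queries=4, num_layers=2):
        super().__init__()
        # Project the face embedding into the decoder's working dimension.
        self.proj = nn.Linear(face_dim, text_dim)
        # Learnable query tokens that become the text-space ID tokens.
        self.queries = nn.Parameter(torch.randn(num_queries, text_dim) * 0.02)
        layer = nn.TransformerDecoderLayer(
            d_model=text_dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=num_layers)

    def forward(self, face_embed):
        # face_embed: (B, face_dim), e.g. from a frozen face recognition model.
        memory = self.proj(face_embed).unsqueeze(1)  # (B, 1, text_dim)
        tgt = self.queries.unsqueeze(0).expand(face_embed.size(0), -1, -1)
        # Queries cross-attend to the identity feature, producing tokens
        # that can be concatenated with text embeddings for the U-Net.
        return self.decoder(tgt, memory)             # (B, num_queries, text_dim)

ie = IdentityEncoder()
tokens = ie(torch.randn(2, 512))
print(tokens.shape)  # torch.Size([2, 4, 768])
```

The resulting tokens live in the same dimension as the text encoder's output, so they can be injected alongside the prompt embeddings without modifying the frozen diffusion backbone.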