The paper introduces Face-Adapter, an efficient and effective adapter designed for high-precision and high-fidelity face editing with pre-trained diffusion models. Face-Adapter addresses the limitations of current face reenactment and face swapping methods, which rely primarily on GAN frameworks, by providing a unified model that handles both tasks. The key contributions of Face-Adapter include:
1. **Spatial Condition Generator (SCG)**: Predicts 3D landmarks and adapts the foreground mask, providing precise guidance for controlled generation.
2. **Identity Encoder (IE)**: Transfers face embeddings to the text space using a transformer decoder, improving identity consistency (see the sketch after this list).
3. **Attribute Controller (AC)**: Integrates spatial conditions and detailed attributes, enabling conditional inpainting.
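To make the Identity Encoder idea concrete, here is a minimal sketch, assuming a frozen face-recognition backbone (e.g. ArcFace) supplies a 512-d embedding; the class name, token count, and dimensions are illustrative assumptions, not the paper's actual implementation. Learnable query tokens cross-attend to the projected identity embedding via a transformer decoder, yielding identity tokens in the text-conditioning space:

```python
# Hedged sketch of an identity encoder: all names/dimensions are assumptions.
import torch
import torch.nn as nn

class IdentityEncoderSketch(nn.Module):
    def __init__(self, face_dim=512, text_dim=768, num_tokens=4, num_layers=3):
        super().__init__()
        # Project the 1-D face-recognition embedding into the text-token dimension.
        self.proj = nn.Linear(face_dim, text_dim)
        # Learnable query tokens to be filled in with identity information.
        self.queries = nn.Parameter(torch.randn(num_tokens, text_dim))
        decoder_layer = nn.TransformerDecoderLayer(
            d_model=text_dim, nhead=8, batch_first=True
        )
        self.decoder = nn.TransformerDecoder(decoder_layer, num_layers=num_layers)

    def forward(self, face_embedding):
        # face_embedding: (B, face_dim), e.g. from a frozen ArcFace model.
        memory = self.proj(face_embedding).unsqueeze(1)  # (B, 1, text_dim)
        queries = self.queries.unsqueeze(0).expand(face_embedding.size(0), -1, -1)
        # Queries cross-attend to the projected identity embedding.
        return self.decoder(queries, memory)             # (B, num_tokens, text_dim)

tokens = IdentityEncoderSketch()(torch.randn(2, 512))
print(tokens.shape)  # torch.Size([2, 4, 768])
```

Under this reading, the resulting tokens occupy the same space as text tokens, so they can be fed to the diffusion model's cross-attention layers alongside (or in place of) a text prompt.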
Face-Adapter achieves superior performance in motion control precision, identity retention, and generation quality compared to fully fine-tuned GAN-based models. It also integrates seamlessly with various StableDiffusion models, reducing training costs and preventing overfitting. The method is evaluated on datasets such as VoxCeleb1/2 and FaceForensics++, demonstrating its effectiveness in both face reenactment and swapping tasks.