21 Mar 2024
**IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models**
**Authors:** Siying Cui, Jia Guo, Xiang An, Jiankang Deng, Yongle Zhao, Xinyu Wei, Ziyong Feng
**Institutions:** Peking University, InsightFace, DeepGlint
**Abstract:**
Stable Diffusion has emerged as a powerful tool for generating personalized portraits, but existing methods face challenges such as test-time fine-tuning, the need for multiple input images, weak identity preservation, and limited diversity. To address these issues, IDAdapter introduces a tuning-free approach that improves both diversity and identity preservation when personalizing image generation from a single face image. By injecting the personalized concept through both textual and visual pathways and applying a face identity loss during training, IDAdapter enriches identity-related content and guides the model toward diverse styles, expressions, and angles. Extensive evaluations demonstrate that IDAdapter achieves both diversity and identity fidelity in the generated images.
**Key Contributions:**
1. A method that incorporates mixed features from multiple reference images during training, avoiding test-time fine-tuning.
2. Generation of varied angles, expressions, and styles guided by a single photo and text prompt.
3. Comprehensive experiments confirming that the method generates images that closely resemble the input face while exhibiting more varied angles and a broader range of expressions than prior approaches.
**Method:**
IDAdapter builds on Stable Diffusion, a latent diffusion model, to generate personalized images. It introduces a Mixed Facial Features (MFF) module that decouples identity from non-identity features, enhancing diversity while preserving identity. The MFF module combines features extracted from multiple reference images with the textual prompt to guide the generation process. In addition, IDAdapter applies a face identity loss so that the model produces diverse appearances while preserving the subject's identity.
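The summary above stays at the level of architecture description; the PyTorch sketch below is a hedged illustration of the general idea, not the authors' implementation. All module names, dimensions, and interfaces here (`MixedFacialFeatures`, the projection layers, the choice of a transformer layer for mixing) are assumptions: identity vectors and patch-level visual features from several reference faces are fused and projected into extra tokens that condition cross-attention alongside the text prompt, and a face identity loss compares face-recognition embeddings of the generated and reference images.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedFacialFeatures(nn.Module):
    """Hypothetical sketch of an MFF-style module: mixes identity and
    patch-level visual features from N reference faces into K tokens
    that can be appended to the text-encoder output for cross-attention.
    Dimensions and structure are assumptions, not the paper's code."""

    def __init__(self, id_dim=512, vis_dim=1024, text_dim=768, num_tokens=4):
        super().__init__()
        self.id_proj = nn.Linear(id_dim, text_dim)    # e.g. ArcFace embedding -> text space
        self.vis_proj = nn.Linear(vis_dim, text_dim)  # e.g. CLIP patch features -> text space
        self.mix = nn.TransformerEncoderLayer(d_model=text_dim, nhead=8, batch_first=True)
        self.to_tokens = nn.Linear(text_dim, num_tokens * text_dim)
        self.num_tokens = num_tokens
        self.text_dim = text_dim

    def forward(self, id_embeds, vis_feats):
        # id_embeds: (B, N, id_dim) identity vectors from N reference images
        # vis_feats: (B, N, P, vis_dim) patch features from the same images
        b, n, p, _ = vis_feats.shape
        id_tok = self.id_proj(id_embeds)                        # (B, N, D)
        vis_tok = self.vis_proj(vis_feats).flatten(1, 2)        # (B, N*P, D)
        mixed = self.mix(torch.cat([id_tok, vis_tok], dim=1))   # fuse identity + appearance
        pooled = mixed.mean(dim=1)                              # (B, D)
        return self.to_tokens(pooled).view(b, self.num_tokens, self.text_dim)

def face_identity_loss(gen_embed, ref_embed):
    """Hedged sketch of a face identity loss: penalize low cosine
    similarity between face-recognition embeddings of the generated
    image and the reference identity."""
    return (1.0 - F.cosine_similarity(gen_embed, ref_embed, dim=-1)).mean()
```

In this reading, the mixed tokens are concatenated with the text embeddings before the U-Net's cross-attention layers, so a training step sees identity cues from several references while inference needs only a single photo.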
**Experiments:**
- **Qualitative Results:** IDAdapter outperforms baseline methods in terms of face fidelity and diversity.
- **Quantitative Results:** IDAdapter achieves the highest scores among the compared methods on identity-preservation and diversity metrics (a hedged sketch of such metrics follows this list).
- **Ablation Studies:** Various components of IDAdapter are analyzed to understand their impact on image quality.
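The bullets above do not spell out the metric definitions; the sketch below shows one common recipe, offered as an assumption rather than the paper's exact protocol: identity preservation as cosine similarity between face-recognition embeddings, and diversity as mean pairwise perceptual distance (e.g., LPIPS) among generations. The helper names and the `distance_fn` callable are hypothetical.

```python
import itertools
import torch
import torch.nn.functional as F

def identity_score(gen_embeds, ref_embed):
    """Mean cosine similarity between face embeddings of generated
    images (M, D) and the reference identity embedding (D,).
    The face-recognition extractor producing these embeddings is assumed."""
    return F.cosine_similarity(gen_embeds, ref_embed.unsqueeze(0), dim=-1).mean()

def diversity_score(distance_fn, images):
    """Mean pairwise perceptual distance over generated image tensors;
    higher means more varied outputs. `distance_fn` is any callable
    returning a scalar distance (e.g., an LPIPS model)."""
    pairs = list(itertools.combinations(images, 2))
    dists = [float(distance_fn(a, b)) for a, b in pairs]
    return sum(dists) / len(dists)
```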
**Conclusion:**
IDAdapter marks a notable advance in personalized avatar generation, enabling the creation of diverse, high-fidelity images from a single facial image without any fine-tuning at inference time.