21 Mar 2024 | Siying Cui, Jia Guo, Xiang An, Jiankang Deng, Yongle Zhao, Xinyu Wei, Ziyong Feng
IDAdapter is a tuning-free method for personalized text-to-image generation: given a single facial image, it creates images of that person without any test-time fine-tuning.

The method builds on Stable Diffusion and introduces a Mixed Facial Features (MFF) module that, during training, mixes features extracted from multiple reference images of the same person. The mixed features enrich identity-related detail and help decouple identity from non-identity content, so the model can vary style, pose, and expression more freely than previous works while still preserving the face. The personalized concept is injected into the generation process along two paths, textual injection and visual injection, and a face identity loss further guides the model toward high fidelity to the reference face. A minimal sketch of this kind of feature fusion follows.
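As a rough illustration of the mixing step, here is a minimal PyTorch-style sketch of an MFF-like fusion module. The feature sources (a face-recognition embedding plus vision-encoder patch features per reference image), the dimensions, and all module and parameter names are assumptions chosen for illustration, not the paper's exact architecture.

```python
# Hedged sketch of a Mixed Facial Features (MFF)-style fusion module.
# Not the authors' code; shapes, layer sizes, and names are assumptions.
import torch
import torch.nn as nn


class MixedFacialFeatures(nn.Module):
    """Fuse identity and appearance features from several reference images
    of the same person into a compact set of visual tokens."""

    def __init__(self, id_dim=512, patch_dim=1024, token_dim=768, num_tokens=4):
        super().__init__()
        self.id_proj = nn.Linear(id_dim, token_dim)          # face-recognition embedding -> token space
        self.patch_proj = nn.Linear(patch_dim, token_dim)    # vision-encoder patch features -> token space
        self.queries = nn.Parameter(torch.randn(num_tokens, token_dim))
        self.attn = nn.MultiheadAttention(token_dim, num_heads=8, batch_first=True)
        self.out = nn.Linear(token_dim, token_dim)

    def forward(self, id_embeds, patch_feats):
        # id_embeds:   (B, N, id_dim)       one identity vector per reference image
        # patch_feats: (B, N, P, patch_dim) patch features per reference image
        B, N, P, _ = patch_feats.shape
        ids = self.id_proj(id_embeds)                          # (B, N, token_dim)
        patches = self.patch_proj(patch_feats).flatten(1, 2)   # (B, N*P, token_dim)
        context = torch.cat([ids, patches], dim=1)             # mix features from all references
        queries = self.queries.unsqueeze(0).expand(B, -1, -1)
        fused, _ = self.attn(queries, context, context)        # (B, num_tokens, token_dim)
        return self.out(fused)                                 # visual tokens for injection via cross-attention


# Example: batch of 2, 3 reference images each, 256 patch tokens per image
mff = MixedFacialFeatures()
tokens = mff(torch.randn(2, 3, 512), torch.randn(2, 3, 256, 1024))
print(tokens.shape)  # torch.Size([2, 4, 768])
```

At inference time a single facial image would simply supply one reference instead of several, which is what makes the mixing a training-time enrichment rather than a test-time requirement.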
Extensive evaluations on several datasets show strong identity preservation together with pose and expression diversity: IDAdapter generates images with a wide range of styles, angles, and expressions while maintaining the identity of the subject.
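Returning to the training objective mentioned above, the sketch below shows one way a denoising loss could be combined with a face identity loss. It is written in pixel space for simplicity (Stable Diffusion actually operates on latents), and the loss weight, the x0 reconstruction, and the face_encoder callable are assumptions rather than the paper's exact formulation.

```python
# Hedged sketch of a diffusion loss plus face identity loss; weights and
# the identity encoder are placeholders, not the paper's exact recipe.
import torch
import torch.nn.functional as F


def idadapter_style_loss(noise_pred, noise, x_t, alpha_bar_t,
                         face_encoder, ref_id_embed, lambda_id=0.1):
    # Standard denoising objective: predict the noise that was added.
    diffusion_loss = F.mse_loss(noise_pred, noise)

    # Estimate the clean image x0 from the noisy input, then compare its
    # face embedding against the reference identity embedding.
    sqrt_ab = alpha_bar_t.sqrt().view(-1, 1, 1, 1)
    sqrt_one_minus_ab = (1 - alpha_bar_t).sqrt().view(-1, 1, 1, 1)
    x0_pred = (x_t - sqrt_one_minus_ab * noise_pred) / sqrt_ab

    pred_id_embed = face_encoder(x0_pred)  # assumed: returns L2-normalized identity vectors
    id_loss = 1 - F.cosine_similarity(pred_id_embed, ref_id_embed, dim=-1).mean()

    return diffusion_loss + lambda_id * id_loss
```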