23 Apr 2024 | Weifeng Chen, Jiachang Zhang, Jie Wu, Hefeng Wu, Xuefeng Xiao, Liang Lin
ID-Aligner is a novel framework designed to enhance identity-preserving text-to-image generation (ID-T2I) through reward feedback learning. The framework introduces two key rewards, an identity consistency reward and an identity aesthetic reward, which are integrated into both LoRA-based and Adapter-based models. These rewards aim to improve the accuracy of identity preservation and the visual appeal of generated images. The method leverages face detection and recognition models to measure identity consistency, and uses human-annotated preference data to provide aesthetic tuning signals. Extensive experiments on the SD1.5 and SDXL diffusion models demonstrate the effectiveness of ID-Aligner, showing superior identity preservation and aesthetic quality compared to existing methods. The framework is flexible and can be applied to various text-to-image models, making it a versatile tool for generating high-quality, identity-preserving images.
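The identity consistency reward described above can be illustrated with a minimal sketch: compare a face-recognition embedding of the reference identity against the embedding of the face detected in the generated image, using cosine similarity as the reward signal. This is a simplified, hypothetical illustration, not the paper's implementation; in practice the embeddings would come from a pretrained face recognition model (e.g. an ArcFace-style encoder) applied to detected face crops, whereas here they are plain vectors so the example stays self-contained.

```python
import numpy as np

def identity_consistency_reward(ref_embedding: np.ndarray,
                                gen_embedding: np.ndarray) -> float:
    """Cosine similarity between reference and generated face embeddings.

    Hypothetical sketch: in the full pipeline, both inputs would be
    produced by a face detection + recognition model; here they are
    raw vectors to keep the example runnable.
    """
    ref = ref_embedding / np.linalg.norm(ref_embedding)
    gen = gen_embedding / np.linalg.norm(gen_embedding)
    return float(np.dot(ref, gen))

# A generated face whose embedding closely matches the reference scores
# near 1.0, while an unrelated identity scores near 0.0.
rng = np.random.default_rng(0)
ref = rng.standard_normal(512)                 # reference identity embedding
same = ref + 0.05 * rng.standard_normal(512)   # small perturbation of it
other = rng.standard_normal(512)               # unrelated identity

print(identity_consistency_reward(ref, same))   # close to 1.0
print(identity_consistency_reward(ref, other))  # near 0.0
```

During reward feedback learning, a scalar like this would be backpropagated (or used as a fine-tuning signal) to steer the generator toward images whose detected faces embed close to the reference identity.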