Human Image Personalization with High-fidelity Identity Preservation

25 Mar 2024 | Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, and Ping Luo
This paper introduces FlashFace, a practical tool for human image personalization that allows users to customize their photos by providing one or a few reference face images and a text prompt. The key contributions of FlashFace are:

1. **Higher-Fidelity Identity Preservation**: FlashFace encodes the reference face into a series of feature maps instead of a single image token, allowing for better retention of facial details such as scars, tattoos, and face shape.
2. **Better Instruction Following**: A disentangled integration strategy is introduced to balance the text and image guidance during the text-to-image generation process, enabling precise language control even when there is a conflict between the reference images and the text prompts.

The method is evaluated on various applications, including human image personalization, face swapping under language prompts, and transforming virtual characters into real people. Experimental results demonstrate the effectiveness of FlashFace, showing superior performance in terms of identity preservation and language control compared to existing methods. The paper also includes a detailed analysis of the model's performance, ablation studies, and a benchmark comparison with other state-of-the-art methods.
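
The summary does not say how text and image guidance are balanced numerically. As a rough illustration of the general idea only, the sketch below extends standard classifier-free guidance with two separate weights, one for the text prompt and one for the reference image. The function name `combine_guidance`, its parameters, and the formula itself are assumptions made for this example; they are not taken from the paper and may differ from what FlashFace actually does.

```python
from typing import List, Sequence


def combine_guidance(
    eps_uncond: Sequence[float],
    eps_text: Sequence[float],
    eps_text_ref: Sequence[float],
    text_scale: float,
    ref_scale: float,
) -> List[float]:
    """Blend three noise predictions using separate text and reference weights.

    Hypothetical inputs (all assumptions for this sketch):
      eps_uncond   - prediction with neither text nor reference face
      eps_text     - prediction conditioned on the text prompt only
      eps_text_ref - prediction conditioned on text and reference face
    """
    return [
        # Start from the unconditional prediction, move toward the text
        # direction by text_scale, then toward the reference direction
        # by ref_scale.
        u + text_scale * (t - u) + ref_scale * (tr - t)
        for u, t, tr in zip(eps_uncond, eps_text, eps_text_ref)
    ]


if __name__ == "__main__":
    uncond = [0.0, 1.0]
    text = [1.0, 3.0]
    text_ref = [2.0, 2.0]
    # A lower ref_scale lets the prompt dominate when text and reference conflict.
    print(combine_guidance(uncond, text, text_ref, text_scale=7.5, ref_scale=0.5))
```

Under this assumed formulation, setting `ref_scale` to 0 ignores the reference face entirely, while setting both weights to 1 reproduces the fully conditioned prediction. Keeping the two weights separate is what lets a user trade identity fidelity against prompt adherence.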