Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm


18 Mar 2024 | Yi Wu*, Ziqiang Li*, Heliang Zheng, Chaoyue Wang, and Bin Li**
This paper proposes Infinite-ID, an ID-semantics decoupling paradigm for identity-preserved personalization that enables high-fidelity identity preservation and semantic consistency in text-to-image generation from a single reference image.

The method introduces identity-enhanced training: an additional image cross-attention module captures sufficient ID information while the diffusion model's original text cross-attention module is deactivated. This ensures that the image stream faithfully represents the identity provided by the reference image while mitigating interference from textual input.

To seamlessly merge the two streams, a feature interaction mechanism combines a mixed attention module with an adaptive mean normalization (AdaIN-mean) operation. This mechanism enhances identity fidelity and semantic consistency, and the AdaIN-mean operation precisely aligns the style of the synthesized image with the desired style prompt, enabling convenient control over the styles of the generated images.

The method is evaluated on both raw photo generation and style image generation, outperforming existing methods in identity fidelity and semantic consistency. It is also robust to changes in input image resolution, demonstrating its effectiveness across various scenarios.
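The two stream-merging operations described above can be sketched as follows. This is a minimal, hypothetical NumPy illustration, not the authors' implementation: `adain_mean` shifts each channel of the content features to the style features' channel mean (mean-only AdaIN, with no variance rescaling), and `mixed_attention` assumes the mixing is done by concatenating keys and values from the text and identity streams before a single attention pass; the function names, shapes, and the exact mixing rule are assumptions for illustration.

```python
import numpy as np

def adain_mean(content, style):
    # AdaIN-mean sketch: shift each channel of the content features so
    # its mean matches the style features' channel mean (no variance
    # rescaling). Shapes assumed to be (tokens, channels).
    return content - content.mean(axis=0, keepdims=True) \
                   + style.mean(axis=0, keepdims=True)

def mixed_attention(q, k_txt, v_txt, k_img, v_img):
    # Mixed-attention sketch: concatenate keys/values from the text
    # stream and the identity (image) stream, then attend jointly so
    # each query can draw on both semantic and ID information.
    k = np.concatenate([k_txt, k_img], axis=0)
    v = np.concatenate([v_txt, v_img], axis=0)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    # Numerically stable softmax over the combined key axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

After `adain_mean`, the per-channel mean of the output equals that of the style features, which is the sense in which the style of the synthesized image is aligned with the style prompt.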
The results show that Infinite-ID maintains strong identity fidelity, high-quality semantic consistency, and precise stylization using only a single reference image.