5 Apr 2024 | Gihyun Kwon, Simon Jenni, Dingzeyu Li, Joon-Young Lee, Jong Chul Ye, Fabian Caba Heilbron
Concept Weaver is a method for generating images that incorporate multiple custom concepts in text-to-image models. It works in two steps: first, it creates a template image aligned with the semantics of the input prompt; second, it personalizes the template with a concept fusion strategy that injects the appearance of the target concepts while preserving the template's structural details. Compared with alternative approaches, Concept Weaver generates multiple custom concepts with higher identity fidelity, seamlessly handles more than two concepts, and closely follows the semantic meaning of the input prompt without blending appearances across different subjects. The method is robust to different architectures and works with both full fine-tuning and Low-Rank Adaptation (LoRA). Evaluated on various datasets, it shows superior text and concept alignment, extends to real-image editing, and produces concept-aware outputs without concept-mixing artifacts.
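At a high level, the fusion idea described above — injecting each concept's appearance only inside its own region while keeping the template's structure everywhere else — can be caricatured as region-masked compositing. The NumPy sketch below is a toy illustration of that intuition only, not the paper's latent-space fusion procedure; the function name, the per-concept masks, and the `alpha` blend weight are all hypothetical.

```python
import numpy as np

def fuse_concepts(template, concept_images, masks, alpha=0.8):
    """Toy region-masked fusion (illustrative only, not the actual method).

    template:       (H, W, 3) float array, the structure-providing image
    concept_images: list of (H, W, 3) float arrays, one per target concept
    masks:          list of (H, W) binary arrays marking each concept's region
    alpha:          hypothetical blend weight for concept appearance
    """
    out = template.copy()
    for img, mask in zip(concept_images, masks):
        m = mask[..., None].astype(float)  # broadcast mask across channels
        # Inside the mask: blend the concept's appearance over the template;
        # outside every mask the template is left untouched.
        out = (1 - m) * out + m * (alpha * img + (1 - alpha) * template)
    return out

# Tiny example: a 2x2 "template" with one concept occupying the left column
template = np.zeros((2, 2, 3))
concept = np.ones((2, 2, 3))
mask = np.array([[1, 0], [1, 0]])
fused = fuse_concepts(template, [concept], [mask], alpha=1.0)
```

In the toy example, the left column takes on the concept's appearance while the right column keeps the template's values, mirroring (in a very loose way) how the fusion strategy avoids blending appearances across different subjects.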