OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models

20 Jul 2024 | Zhe Kong, Yong Zhang, Tianyu Yang, Tao Wang, Kaihao Zhang, Bizhu Wu, Guanying Chen, Wei Liu, Wenhan Luo
OMG is a framework for occlusion-friendly personalized multi-concept generation in diffusion models. It addresses three challenges that arise when composing multiple personalized concepts in one image: identity preservation, occlusion handling, and illumination harmony. The core of the method is a two-stage sampling approach: the first stage collects visual comprehension information (such as layout and occlusion cues) from an initial generation, and the second stage integrates the multiple concepts via a concept noise blending strategy that mitigates identity degradation during composition. OMG can be combined with existing single-concept models such as LoRA and InstantID without additional training, so it generates multi-concept images directly from community-derived models.

Extensive experiments on single- and multi-concept customization show that OMG outperforms competing methods in identity alignment and image quality, particularly in multi-concept scenarios, while keeping illumination harmonious across subjects. The framework is also versatile: it can be combined with additional conditions and models, including ControlNet and style LoRAs, to extend its customization capabilities. Overall, OMG provides an effective and efficient solution for multi-concept personalization in text-to-image generation.
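To make the two-stage idea concrete, here is a minimal, runnable sketch of the second-stage loop with concept noise blending. It assumes stage one has already produced per-concept spatial masks from the visual comprehension step; the function names, the ToyScheduler stub, and the random stand-in models are all illustrative placeholders, not the paper's actual implementation or API.

```python
import torch

class ToyScheduler:
    """Minimal DDIM-like stub so the sketch runs end to end."""
    def __init__(self, steps=4):
        self.timesteps = list(range(steps - 1, -1, -1))

    def step(self, noise, t, latent):
        # Toy update rule; a real scheduler applies the proper
        # DDIM/DDPM posterior step instead of a fixed scale.
        return latent - 0.1 * noise

def omg_stage2_sampling(base_model, concept_models, concept_masks,
                        scheduler, latent, prompt_emb, concept_embs):
    """Sketch of OMG-style stage-2 sampling with concept noise blending.

    Stage 1 (not shown) would run the base model alone to collect
    layout/occlusion information, from which concept_masks are derived.
    Here the masks are taken as given. Each model is a callable
    standing in for a UNet noise prediction eps(x_t, t, c); in
    practice these would be LoRA- or InstantID-adapted copies.
    """
    for t in scheduler.timesteps:
        # Base prediction covers the background and occluded regions.
        noise = base_model(latent, t, prompt_emb)
        # Concept noise blending: overwrite each subject's region with
        # its personalized model's prediction, weighted by the mask.
        for model, mask, emb in zip(concept_models, concept_masks,
                                    concept_embs):
            noise = mask * model(latent, t, emb) + (1 - mask) * noise
        latent = scheduler.step(noise, t, latent)
    return latent

# Toy usage with random stand-ins for the UNets and half-image masks.
if __name__ == "__main__":
    shape = (1, 4, 8, 8)
    base = lambda x, t, c: torch.randn(shape)
    concepts = [lambda x, t, c: torch.randn(shape) for _ in range(2)]
    masks = [torch.zeros(shape), torch.zeros(shape)]
    masks[0][..., :4] = 1.0   # left half for concept 1
    masks[1][..., 4:] = 1.0   # right half for concept 2
    out = omg_stage2_sampling(base, concepts, masks, ToyScheduler(),
                              torch.randn(shape), None, [None, None])
    print(out.shape)  # torch.Size([1, 4, 8, 8])
```

Because the blending happens in noise space at every denoising step rather than by compositing finished images, the regions share one diffusion trajectory, which is what allows occlusion boundaries and illumination to stay consistent across concepts.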