MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

27 Feb 2024 | Dawei Zhou, You Li, Fan Ma, Xiaoting Zhang, Yi Yang
This paper introduces MIGC, a Multi-Instance Generation Controller for text-to-image synthesis that gives precise control over multiple instances in a generated image. MIGC targets the multi-instance generation (MIG) task: generating several instances in a single image, each with its own attributes and position, together with the interactions between them. The method decomposes the MIG task into subtasks, enhances the shading of each instance with an Enhancement Attention mechanism, and combines the results through Layout Attention and a Shading Aggregation Controller.
The authors also propose the COCO-MIG benchmark, which evaluates generation models on MIG tasks; on it, MIGC shows significant improvements in position, attribute, and quantity control. Experiments on the COCO-MIG, COCO-Position, and DrawBench benchmarks demonstrate that MIGC achieves state-of-the-art results, with high success rates and accurate attribute control. Inference speed stays close to that of the original Stable Diffusion, which keeps the method efficient and practical for real-world applications. The contributions are threefold: defining the MIG task, proposing the COCO-MIG benchmark, and introducing the MIGC approach, which enhances pre-trained Stable Diffusion with improved MIG capabilities.
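
The divide-and-combine idea in the summary can be illustrated with a toy sketch. This is not the MIGC implementation: the function names `box_mask` and `aggregate`, the fixed logits, and the constant feature maps are all assumptions made for illustration, and the summary does not describe how the Shading Aggregation Controller actually computes its weights. The sketch only shows one plausible way to blend per-instance results with a background result using per-pixel softmax weights restricted to each instance's bounding box.

```python
import math

def box_mask(height, width, box):
    """Hypothetical helper: 0/1 mask for box = (x0, y0, x1, y1) in normalized coordinates."""
    x0, y0, x1, y1 = box
    return [
        [1.0 if x0 <= (c + 0.5) / width < x1 and y0 <= (r + 0.5) / height < y1 else 0.0
         for c in range(width)]
        for r in range(height)
    ]

def aggregate(instance_maps, masks, background_map,
              instance_logit=2.0, background_logit=0.0):
    """Blend per-instance maps with a background map, pixel by pixel.

    Assumption: fixed logits stand in for aggregation weights. An instance
    only competes for a pixel that lies inside its own box.
    """
    height, width = len(background_map), len(background_map[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            values = [background_map[r][c]]
            logits = [background_logit]
            for feat, mask in zip(instance_maps, masks):
                values.append(feat[r][c])
                logits.append(instance_logit if mask[r][c] > 0 else float("-inf"))
            # Numerically stable softmax over the candidates at this pixel.
            peak = max(logits)
            exps = [math.exp(l - peak) for l in logits]
            total = sum(exps)
            out[r][c] = sum(e / total * v for e, v in zip(exps, values))
    return out

# Toy data: two instances on a 4x4 grid, each with a constant "shading" value.
H = W = 4
boxes = [(0.0, 0.0, 0.5, 0.5), (0.5, 0.5, 1.0, 1.0)]
masks = [box_mask(H, W, b) for b in boxes]
instance_maps = [[[1.0] * W for _ in range(H)], [[3.0] * W for _ in range(H)]]
background = [[0.0] * W for _ in range(H)]
fused = aggregate(instance_maps, masks, background)
```

In this toy setup, pixels inside a box are dominated by that instance's value, while pixels outside every box fall back to the background map, which mirrors the summary's description of combining separately handled instances into one image.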