Good Seed Makes a Good Crop: Discovering Secret Seeds in Text-to-Image Diffusion Models

23 May 2024 | Katherine Xu, Lingzhi Zhang, Jianbo Shi
This paper investigates the impact of random seeds in text-to-image (T2I) diffusion models, revealing that seeds significantly influence image quality, style, layout, and the generation of text artifacts. Certain "golden" seeds consistently produce high-quality images that align well with human preferences, while others yield inferior results. A 1,024-way classifier predicts the seed used to generate an image with over 99.9% accuracy, demonstrating that seeds leave highly distinguishable traces in the images they produce.

The study further identifies interpretable visual dimensions controlled by seeds, such as grayscale rendering, prominent sky regions, and image borders, and shows that seeds affect image composition, including object location, size, and depth. Leveraging these golden seeds improves image generation in two practical ways: using the top-K seeds for high-fidelity inference, and diversifying sampling based on the style or layout tendencies of different seeds. The findings extend to inpainting tasks, where certain seeds tend to insert unwanted text artifacts.

The paper also reviews related work on stochasticity in deep learning models and on optimizing the initial noise of diffusion models, and concludes with a discussion of the limitations and broader impacts of the findings.
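To make the top-K seed idea concrete, the sketch below fixes the random seed via a torch.Generator when sampling from a Stable Diffusion pipeline, so a shortlist of "golden" seeds can be reused across prompts and the best result kept. This is a minimal sketch assuming the Hugging Face diffusers library; the seed list and the score_image function are hypothetical placeholders standing in for the paper's human-preference-based ranking, not its actual method.

```python
# Minimal sketch: sample with a fixed list of "golden" seeds and keep the
# highest-scoring image. Assumes the Hugging Face diffusers library; the
# seed values and score_image are hypothetical placeholders.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def score_image(image):
    # Placeholder: substitute an aesthetic or human-preference scorer here.
    return 0.0

prompt = "a watercolor painting of a lighthouse at dawn"
golden_seeds = [42, 1337, 2024]  # hypothetical top-K seeds from offline ranking

best_image, best_score = None, float("-inf")
for seed in golden_seeds:
    # Fixing the generator's seed pins the initial Gaussian noise, which is
    # what makes a seed's quality/style/layout tendencies reproducible.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(prompt, generator=generator).images[0]
    score = score_image(image)
    if score > best_score:
        best_image, best_score = image, score

best_image.save("best_of_topk_seeds.png")
```

For diversified sampling, the same loop could instead draw one seed from each of several clusters with distinct style or layout tendencies, rather than maximizing a single score.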