INITNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization


6 Apr 2024 | Xiefan Guo, Jinlin Liu, Miaomiao Cui, Jiankai Li, Hongyu Yang, Di Huang
The paper "INITNO: Boosting Text-to-Image Diffusion Models via Initial Noise Optimization" addresses the problem of aligning generated images with their text prompts in diffusion models, particularly Stable Diffusion. The authors identify invalid initial noise as a root cause of misalignment and propose Initial Noise Optimization (InitNO) to refine that noise. InitNO partitions the initial latent space into valid and invalid regions using two measures: cross-attention response scores and self-attention conflict scores. A noise optimization pipeline then guides the initial noise toward the valid regions, so that the generated image matches the semantics of the prompt. The authors report experiments in which the method outperforms existing approaches at generating images that adhere to the text prompt. The code for InitNO is available at <https://github.com/xiefan-guo/initno>.
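The idea of steering initial noise toward a valid region can be illustrated with a toy sketch. This is not the authors' implementation: the score below is a hypothetical stand-in for the attention-based scores (it is not the paper's definition), the names `toy_score_and_grad`, `optimize_initial_noise`, and `token_dirs` are invented for illustration, and the threshold, learning rate, and step budget are arbitrary assumptions. The sketch only shows the general pattern: score the noise, stop if it already counts as valid, otherwise take a gradient step and re-score.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def toy_score_and_grad(z, token_dirs):
    """Hypothetical stand-in for an attention-based score (lower is better).

    Each token's "response" is sigmoid(dot(z, d)). The score is one minus the
    weakest response, so a high score means some token is being neglected.
    Returns the score and its gradient with respect to z.
    """
    responses = [sigmoid(sum(zi * di for zi, di in zip(z, d))) for d in token_dirs]
    k = min(range(len(responses)), key=responses.__getitem__)  # weakest token
    r = responses[k]
    # d/dz [1 - sigmoid(dot(z, d_k))] = -r * (1 - r) * d_k
    grad = [-r * (1.0 - r) * di for di in token_dirs[k]]
    return 1.0 - r, grad


def optimize_initial_noise(z0, token_dirs, threshold=0.2, lr=0.5, max_steps=200):
    """Gradient descent on the noise until the toy score marks it as valid."""
    z = list(z0)
    for step in range(max_steps):
        score, grad = toy_score_and_grad(z, token_dirs)
        if score <= threshold:  # noise already lies in the "valid" region
            return z, score, step
        z = [zi - lr * gi for zi, gi in zip(z, grad)]
    score, _ = toy_score_and_grad(z, token_dirs)
    return z, score, max_steps


# Assumed toy setup: two "tokens", each reading a different half of a 4-d latent.
token_dirs = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
z0 = [-0.5, 0.3, 0.2, -0.4]
initial_score, _ = toy_score_and_grad(z0, token_dirs)
z_opt, final_score, steps = optimize_initial_noise(z0, token_dirs)
print(f"score {initial_score:.3f} -> {final_score:.3f} after {steps} steps")
```

In the actual method the score would come from the diffusion model's cross-attention and self-attention maps rather than from a fixed linear readout, and the repository linked above is the reference for how the authors combine and optimize them.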