CFG++: MANIFOLD-CONSTRAINED CLASSIFIER FREE GUIDANCE FOR DIFFUSION MODELS

12 Jun 2024 | Hyungjin Chung*, Jeongsol Kim, Geon Yeong Park*, Hyelin Nam*, Jong Chul Ye
CFG++ is a modification of classifier-free guidance (CFG) for diffusion models that addresses the off-manifold phenomenon limiting the effectiveness of traditional CFG. The paper argues that the issues with CFG stem from this off-manifold phenomenon rather than from inherent limitations of diffusion models. CFG++ reformulates text guidance as an inverse problem with a text-conditioned score matching loss and derives a simple fix to CFG.

CFG++ uses a small guidance scale, typically λ ∈ [0.0, 1.0], to smoothly interpolate between unconditional and conditional sampling, achieving results comparable to traditional CFG with a guidance scale of ω ∼ 12.5 at 50 neural function evaluations (NFE). The paper reports that CFG++ outperforms traditional CFG in text-to-image generation, DDIM inversion, and image editing, and that it enables the incorporation of CFG guidance into diffusion inverse solvers. The authors suggest that these gains point to a wide-ranging impact and to potential applications in fields that use text guidance.
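
The summary does not spell out the update rule, so the following is only a minimal sketch of how the "simple fix" is usually written for a deterministic DDIM step. It assumes the CFG++ update denoises with the guided noise estimate but renoises with the unconditional one, whereas standard CFG uses the guided estimate for both. The function names, the list-of-floats stand-in for image tensors, and the `abar_t` / `abar_prev` arguments (cumulative noise-schedule products) are illustrative assumptions, not the authors' code.

```python
import math
from typing import List, Sequence


def guided_noise(eps_uncond: Sequence[float], eps_cond: Sequence[float],
                 scale: float) -> List[float]:
    """Blend noise predictions: scale=0 is unconditional, scale=1 is conditional."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def ddim_step(x_t: Sequence[float], eps_denoise: Sequence[float],
              eps_renoise: Sequence[float], abar_t: float,
              abar_prev: float) -> List[float]:
    """Deterministic DDIM update (eta = 0) with separate noise estimates
    for the denoising and the renoising halves of the step."""
    # Estimate the clean sample from x_t using eps_denoise.
    x0_hat = [(x - math.sqrt(1.0 - abar_t) * e) / math.sqrt(abar_t)
              for x, e in zip(x_t, eps_denoise)]
    # Move to the previous noise level using eps_renoise.
    return [math.sqrt(abar_prev) * x0 + math.sqrt(1.0 - abar_prev) * e
            for x0, e in zip(x0_hat, eps_renoise)]


def cfg_step(x_t, eps_uncond, eps_cond, omega, abar_t, abar_prev):
    """Standard CFG: the guided noise is used for both halves of the step."""
    eps = guided_noise(eps_uncond, eps_cond, omega)
    return ddim_step(x_t, eps, eps, abar_t, abar_prev)


def cfgpp_step(x_t, eps_uncond, eps_cond, lam, abar_t, abar_prev):
    """CFG++ (assumed form): guided noise for denoising,
    unconditional noise for renoising."""
    eps = guided_noise(eps_uncond, eps_cond, lam)
    return ddim_step(x_t, eps, eps_uncond, abar_t, abar_prev)
```

With a scale of 0 both functions reduce to unconditional DDIM sampling. For any other scale they differ only in the renoising term, which is where this sketch keeps the unconditional estimate. In a real sampler, `eps_uncond` and `eps_cond` would come from two forward passes of the noise-prediction network at each step.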