Guiding a Diffusion Model with a Bad Version of Itself

4 Jun 2024 | Tero Karras, Miika Aittala, Tuomas Kynkäänniemi, Jaakko Lehtinen, Timo Aila, Samuli Laine
This paper introduces autoguidance, a method for improving image quality in diffusion models. Whereas classifier-free guidance (CFG) steers a conditional model using an unconditional model, autoguidance guides the model with a smaller, less-trained version of itself. The key insight is that this weaker guiding model makes the same kinds of errors as the main model, only more strongly, so guiding away from it gives disentangled control over image quality and variation: quality improves without sacrificing diversity. Using publicly available networks, the method achieves record FID scores of 1.25 on ImageNet-512 and 1.01 on ImageNet-64. Unlike CFG, autoguidance also applies to unconditional diffusion models, significantly improving their quality. Experiments in both synthetic and practical image-synthesis settings show that autoguidance outperforms CFG in image quality and diversity, and that it is applicable to a wide range of diffusion models.
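Mechanically, this kind of guidance is a linear extrapolation from the weak model's denoised estimate toward the main model's estimate, with the guidance weight controlling how far past the main model the result is pushed. A minimal sketch of that combination step (function names and the toy denoiser outputs below are illustrative stand-ins, not the paper's implementation):

```python
import numpy as np

def autoguide(d_main, d_weak, w):
    """Combine two denoiser outputs by linear extrapolation.

    w = 1 recovers the main model unguided; w > 1 pushes the
    estimate away from the weak model's (amplified) errors.
    """
    return d_weak + w * (d_main - d_weak)

# Toy stand-ins for the two denoised estimates at one sampling step:
# pretend the weak model over-smooths toward the mean of the input.
x = np.array([0.2, 0.8, -0.5])
d_main_out = 0.9 * x                     # hypothetical main-model estimate
d_weak_out = np.full_like(x, x.mean())   # hypothetical weak-model estimate

guided = autoguide(d_main_out, d_weak_out, w=2.0)
```

With `w = 1` the weak model drops out entirely, which is a quick sanity check that the extrapolation is wired up correctly; larger weights trade variation for quality, as in CFG.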