Score Distillation Sampling with Learned Manifold Corrective


4 Jul 2024 | Thiemo Alldieck, Nikos Kolotouros, and Cristian Sminchisescu
This paper introduces a novel loss function called LMC-SDS (Score Distillation Sampling with Learned Manifold Corrective) to address issues with the original SDS loss in image diffusion models. The SDS loss, used in DreamFusion for text-to-3D synthesis, has been found to produce noisy gradients that lead to oversaturation, repeated details, and other artifacts. The authors analyze the SDS loss and identify that the term $L_{proj}$, which is responsible for denoising, provides noisy gradients due to a frequency bias in the diffusion model. To mitigate this, they propose a shallow network that learns the frequency bias of the diffusion model and corrects it, resulting in cleaner gradients and better performance.

The LMC-SDS loss is shown to be effective in various applications, including optimization-based image synthesis and editing, zero-shot image translation network training, and text-to-3D synthesis. The method is designed to provide meaningful gradients along the learned manifold of real images, reducing the need for high text guidance. The authors demonstrate that their loss formulation leads to higher visual fidelity and more diverse results compared to existing methods. They also show that their approach can be used for 3D asset generation, where it produces more detailed and realistic results than the original SDS loss.

The corrective network is implemented with a standard U-Net architecture and is trained on a dataset of real images. The results show that LMC-SDS outperforms other methods on both qualitative and quantitative metrics, including CLIP scores and LPIPS. The paper concludes that LMC-SDS is a significant improvement over the original SDS loss and provides a more stable and effective way to use image diffusion models as image priors.
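To make the core idea concrete, the sketch below contrasts a plain SDS-style gradient with a manifold-corrected variant. This is a minimal toy illustration, not the paper's implementation: `eps_model` stands in for a pretrained diffusion denoiser, `corrective` for the shallow network that learns the model's systematic error, and the noise schedule value is a placeholder.

```python
import numpy as np

rng = np.random.default_rng(0)

def eps_model(x_t, t):
    # Stand-in for a pretrained diffusion model's noise prediction.
    # A real model would be a text-conditioned U-Net; this toy linear
    # predictor is a hypothetical placeholder.
    return 0.9 * x_t

def corrective(x_t, t):
    # Stand-in for the shallow network that learns the denoiser's
    # systematic, frequency-biased error (hypothetical toy version).
    return 0.1 * x_t

def sds_gradient(x, t, noise, use_lmc=False):
    # Forward-diffuse the current image estimate x to timestep t,
    # then form the noise-prediction residual, as in classic SDS.
    alpha_bar = 0.5  # placeholder noise-schedule value for timestep t
    x_t = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * noise
    residual = eps_model(x_t, t) - noise
    if use_lmc:
        # LMC-SDS idea: subtract the learned estimate of the model's
        # predictable error so cleaner signal drives the optimization.
        residual = residual - corrective(x_t, t)
    w_t = 1.0  # timestep-dependent weighting, simplified to a constant
    return w_t * residual

x = rng.standard_normal((4, 4))      # toy "image" being optimized
noise = rng.standard_normal((4, 4))  # sampled Gaussian noise
g_sds = sds_gradient(x, t=500, noise=noise)
g_lmc = sds_gradient(x, t=500, noise=noise, use_lmc=True)
print(g_sds.shape, g_lmc.shape)
```

In the paper the corrective network is trained to predict the denoiser's error on real images; here both networks are toy closed-form functions purely to show where the corrective term enters the gradient.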