DIFFUSION MODELS ARE INNATE ONE-STEP GENERATORS


7 Jun 2024 | Bowen Zheng, Tianming Yang
Diffusion Models (DMs) have achieved remarkable success in image generation, but their multi-step sampling process is computationally expensive. Instance-based distillation methods attempt to produce a one-step generator by training a simpler student model to mimic the outputs of a more complex teacher model. However, because the teacher and student converge to different local minima, the student cannot fully match the teacher, and performance remains suboptimal.

To address this, the authors introduce a novel distributional distillation method that trains the student with an exclusive distributional loss rather than instance-level targets. This approach significantly reduces the number of training images required while achieving state-of-the-art (SOTA) results. The authors further show that DM layers are differentially activated at different time steps, suggesting an innate capability for one-step generation, and that freezing most of the convolutional layers in the DM during distillation improves performance further. The resulting method, GDD (GAN Distillation at Distribution Level), achieves SOTA results on CIFAR-10, AFHQv2 64x64, FFHQ 64x64, and ImageNet 64x64 with only 5 million training images and modest computational resources. Ablation studies and comparisons with instance-based distillation methods further validate the design. The authors conclude that DMs are inherently capable of one-step generation, and that their approach not only improves efficiency but also offers valuable insight into diffusion distillation.
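The two ingredients described above — a GAN-style distributional loss driving a one-step generator, with most convolutional weights frozen — can be sketched as follows. This is a minimal illustration assuming PyTorch; the tiny generator, discriminator, and tensor shapes are placeholder assumptions, not the paper's actual architecture or training procedure.

```python
# Hypothetical sketch: distribution-level distillation with frozen conv layers.
# The models below are toy stand-ins, NOT the authors' GDD implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

def freeze_conv_layers(model: nn.Module) -> int:
    """Freeze every Conv2d parameter in the model; return how many tensors were frozen."""
    n = 0
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            for p in m.parameters():
                p.requires_grad_(False)
                n += 1
    return n

class TinyGenerator(nn.Module):
    """Toy stand-in for a diffusion U-Net reused as a one-step generator."""
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 8, 3, padding=1)
        self.conv2 = nn.Conv2d(8, 3, 3, padding=1)
        self.scale = nn.Parameter(torch.ones(1))  # a non-conv parameter stays trainable
    def forward(self, z):
        return self.scale * self.conv2(F.relu(self.conv1(z)))

gen = TinyGenerator()
frozen = freeze_conv_layers(gen)  # conv1/conv2 weight+bias -> 4 tensors frozen
trainable = [name for name, p in gen.named_parameters() if p.requires_grad]

# Distributional objective: a discriminator scores generated samples as a set,
# so the generator is matched to the data distribution, not to per-instance
# teacher outputs. Here: the non-saturating GAN generator loss.
disc = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1),
    nn.Flatten(),
    nn.LazyLinear(1),
)
z = torch.randn(4, 3, 8, 8)       # one network evaluation, no iterative sampling
fake = gen(z)
g_loss = F.softplus(-disc(fake)).mean()
g_loss.backward()
# Gradients flow only into the unfrozen parameters of the generator.
```

Only `scale` receives a gradient here; the frozen convolutions keep the features the diffusion model already learned, which is what motivates freezing them during distillation.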