24 May 2024 | Tianwei Yin, Michaël Gharbi, Taesung Park, Richard Zhang, Eli Shechtman, Frédo Durand, William T. Freeman
The paper "Improved Distribution Matching Distillation for Fast Image Synthesis" introduces DMD2, an advanced technique for distilling expensive diffusion models into efficient one-step generators. DMD2 addresses the limitations of the original Distribution Matching Distillation (DMD) method by eliminating the need for a regression loss, which was computationally expensive and limited the quality of the distilled model. The key contributions of DMD2 include:
1. **Removing the Regression Loss**: DMD2 eliminates the need for a regression loss, reducing computational costs and allowing for more flexible and scalable training.
2. **Stabilizing Distribution Matching**: Without the regression loss, training becomes unstable. DMD2 introduces a two-time-scale update rule that updates the fake diffusion critic several times per generator update, so the critic accurately tracks the generator's changing output distribution (see the training-loop sketch after this list).
3. **Integrating a GAN Loss**: A GAN loss is integrated into the distillation procedure, with a discriminator trained to distinguish generated samples from real images. Because this signal comes from real data rather than from the teacher, it mitigates the teacher's imperfect score estimation and can push the student beyond the teacher's quality (also shown in the sketch below).
4. **Supporting Multi-step Sampling**: A new training procedure enables multi-step sampling in the student model, fixing the training-inference input mismatch by simulating inference-time generator samples during training rather than noising real images (see the sampling sketch below).
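To make contributions 2 and 3 concrete, here is a minimal PyTorch sketch of the training loop under simplified assumptions: tiny MLPs stand in for the diffusion U-Nets, the noise schedule and loss weightings are schematic, and the fixed 5:1 critic-to-generator update ratio illustrates the two-time-scale rule. This is a sketch of the structure, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

# Stand-ins for the three networks. In practice these are diffusion U-Nets;
# here they are tiny MLPs so the sketch runs end to end.
def mlp(din, dout):
    return torch.nn.Sequential(torch.nn.Linear(din, 128), torch.nn.SiLU(),
                               torch.nn.Linear(128, dout))

D = 64                                    # toy "image" dimensionality
generator = mlp(D, D)                     # one-step student: noise -> image
fake_score = mlp(D + 1, D)                # critic: denoiser trained on student samples
gan_head = mlp(D, 1)                      # discriminator head (real vs. fake)
teacher = mlp(D + 1, D)                   # frozen pretrained "real score" model
for p in teacher.parameters():
    p.requires_grad_(False)

opt_g = torch.optim.AdamW(generator.parameters(), lr=2e-6)
opt_c = torch.optim.AdamW(list(fake_score.parameters()) +
                          list(gan_head.parameters()), lr=2e-6)

CRITIC_STEPS = 5                          # two-time-scale rule: several critic
                                          # updates per generator update

def add_noise(x, t):
    """Forward-diffuse x to noise level t in (0, 1]."""
    return (1 - t).sqrt() * x + t.sqrt() * torch.randn_like(x)

for step in range(100):
    real = torch.randn(32, D)             # placeholder for a real-data batch

    # --- critic phase: denoising loss on student samples + GAN loss -------
    for _ in range(CRITIC_STEPS):
        with torch.no_grad():
            fake = generator(torch.randn(32, D))
        t = torch.rand(32, 1)
        noisy_fake = add_noise(fake, t)
        denoise_loss = F.mse_loss(fake_score(torch.cat([noisy_fake, t], -1)), fake)
        gan_loss = (F.softplus(gan_head(fake)).mean() +    # fake -> low score
                    F.softplus(-gan_head(real)).mean())    # real -> high score
        opt_c.zero_grad()
        (denoise_loss + gan_loss).backward()
        opt_c.step()

    # --- generator phase: distribution-matching gradient + GAN term -------
    fake = generator(torch.randn(32, D))
    t = torch.rand(32, 1)
    noisy = add_noise(fake, t)
    with torch.no_grad():                 # score difference, no grad through critics
        grad = (fake_score(torch.cat([noisy, t], -1)) -
                teacher(torch.cat([noisy, t], -1)))
    # Surrogate whose gradient w.r.t. `fake` is (simplified) the DMD direction.
    dmd_loss = 0.5 * F.mse_loss(fake, (fake - grad).detach())
    g_gan_loss = F.softplus(-gan_head(fake)).mean()        # non-saturating GAN loss
    opt_g.zero_grad()
    (dmd_loss + g_gan_loss).backward()
    opt_g.step()
```

Note the division of labor: the critic phase only sees detached generator samples (plus real data for the GAN head), while the generator phase only receives gradients through its own output, via the teacher/critic score difference and the discriminator.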
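Contribution 4 is easiest to see as pseudocode. Below is a hypothetical sketch of the multi-step student: inference alternates denoising with re-noising on a fixed timestep schedule, and training inputs for each step are produced by running that same loop, so the generator trains on its own intermediate outputs rather than on noised real images. The placeholder generator and the four-level schedule are illustrative only.

```python
import torch

D = 64
SCHEDULE = [0.999, 0.75, 0.5, 0.25]       # illustrative 4-step noise levels

def generator(x_t, t):
    """Stand-in for the timestep-conditioned student; predicts clean x0."""
    return x_t * (1 - t)                   # placeholder computation only

def renoise(x0, t):
    """Forward-diffuse a clean prediction back to noise level t."""
    return (1 - t) ** 0.5 * x0 + t ** 0.5 * torch.randn_like(x0)

def sample(batch=4):
    """Few-step inference: denoise, re-noise to the next level, repeat."""
    x = torch.randn(batch, D)              # start from pure noise
    for i, t in enumerate(SCHEDULE):
        x0 = generator(x, t)               # one student evaluation per step
        if i + 1 < len(SCHEDULE):
            x = renoise(x0, SCHEDULE[i + 1])
    return x0

def training_input(k):
    """Input for training the k-th step: simulate the first k inference steps
    without gradients, so training sees the generator's own intermediate
    outputs (DMD2's fix for the train/inference input mismatch)."""
    with torch.no_grad():
        x = torch.randn(4, D)
        for i in range(k):
            x = renoise(generator(x, SCHEDULE[i]), SCHEDULE[i + 1])
    return x                               # lives at noise level SCHEDULE[k]
```

Each call to `sample()` costs one generator evaluation per schedule entry, so the four-level schedule above corresponds to a four-step sampler.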
The authors evaluate DMD2 on several benchmarks, including class-conditional image generation on ImageNet 64×64 and zero-shot text-to-image synthesis on COCO 2014. DMD2 sets a new state of the art for one-step image generation, with FID scores of 1.28 on ImageNet 64×64 and 8.35 on zero-shot COCO 2014, surpassing the original teacher model despite a 500× reduction in inference cost. DMD2 also generates high-quality megapixel images when distilling SDXL, with exceptional visual quality among few-step methods. The authors release their code and pre-trained models to facilitate further research.