EDA-DM: Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models

26 Sep 2024 | Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu
This paper proposes EDA-DM, a post-training quantization (PTQ) method for diffusion models that addresses the distribution mismatches limiting their performance under low-bit quantization. Although diffusion models are effective for image generation, their complex networks and activation distributions that shift across denoising steps make them hard to deploy in low-latency applications. Existing PTQ methods suffer mismatches at two levels, the calibration-sample level and the reconstruction-output level, which leads to suboptimal performance.

EDA-DM tackles these issues with two components: Temporal Distribution Alignment Calibration (TDAC) and Fine-grained Block Reconstruction (FBR). TDAC selects calibration samples based on their feature maps in the latent space, so that the calibration set aligns with the overall distribution of samples across denoising steps. FBR modifies the block-wise reconstruction loss by incorporating layer-wise losses, aligning the outputs of the quantized and full-precision models at multiple network granularities.

Extensive experiments show that EDA-DM significantly outperforms existing PTQ methods across diffusion models (DDIM, LDM-4, LDM-8, Stable-Diffusion) and datasets (CIFAR-10, LSUN-Bedroom, LSUN-Church, ImageNet, MS-COCO). It achieves state-of-the-art performance under low-bit quantization and remains robust across model scales, resolutions, and guidance conditions, while also reducing model size and bit operations. Evaluations with FID, sFID, IS, and CLIP score show significant improvements in generation quality and semantic relevance.
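To make the TDAC idea concrete, here is a minimal sketch (not the paper's actual implementation) of distribution-aligned calibration selection: candidate samples are scored by how close their flattened latent feature maps lie to the mean of all candidates, and the closest ones form the calibration set. The function name, the use of plain NumPy, and the simple mean-distance criterion are all illustrative assumptions.

```python
import numpy as np

def select_calibration_samples(feature_maps, n_select):
    """Hypothetical sketch of distribution-aligned selection:
    pick the samples whose flattened latent feature maps are
    closest to the mean over all candidates, so the calibration
    set tracks the overall sample distribution."""
    flat = feature_maps.reshape(len(feature_maps), -1)  # (N, D)
    center = flat.mean(axis=0)                          # overall distribution center
    dists = np.linalg.norm(flat - center, axis=1)       # distance of each sample
    return np.argsort(dists)[:n_select]                 # indices of best-aligned samples
```

For example, given three candidate feature maps, the one nearest the per-element mean is selected first; a real pipeline would gather candidates across denoising timesteps before scoring them.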
Human preference evaluations further confirm the effectiveness of EDA-DM in preserving the quality of generated images.
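The FBR objective can likewise be sketched in miniature: a block-level reconstruction loss augmented with layer-wise terms, so the quantized model is aligned with the full-precision model at more than one granularity. The function name, the MSE form, and the weighting coefficient `lam` are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fbr_loss(fp_block_out, q_block_out, fp_layer_outs, q_layer_outs, lam=0.1):
    """Hypothetical sketch of a fine-grained reconstruction loss:
    block-level MSE between full-precision and quantized block outputs,
    plus a lam-weighted sum of per-layer MSE terms inside the block."""
    block_loss = np.mean((fp_block_out - q_block_out) ** 2)
    layer_loss = sum(np.mean((fp - q) ** 2)
                     for fp, q in zip(fp_layer_outs, q_layer_outs))
    return block_loss + lam * layer_loss
```

During reconstruction, such a loss would be minimized over the quantization parameters of each block; the extra layer terms penalize errors that cancel out at the block output but distort intermediate activations.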