26 Mar 2024 | Yurui Qian, Qi Cai, Yingwei Pan, Yehao Li, Ting Yao, Qibin Sun, and Tao Mei
Diffusion models have revolutionized image generation, but they often rely on the current sample alone to denoise the next one, which leads to denoising instability. This paper introduces Moving Average Sampling in Frequency domain (MASF), a technique that reinterprets the iterative denoising process as model optimization and uses a moving average mechanism to ensemble all prior samples. Instead of applying the moving average directly to the denoised samples, MASF first maps these samples to the data space and then performs the moving average there, which avoids distribution shift. MASF additionally decomposes the samples into different frequency components and runs the moving average separately on each component, prioritizing low-frequency components in early timesteps and gradually shifting focus to high-frequency components later. Extensive experiments on unconditional and conditional diffusion models demonstrate that MASF significantly improves performance with minimal additional complexity. The approach integrates seamlessly into existing diffusion models and sampling schedules and shows superior results compared to baselines.
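The three steps named in the abstract (map each sample to the data space, split it into frequency components, average each component with a timestep-dependent weight) can be sketched roughly as below. This is a toy illustration on 1-D lists, not the authors' implementation. The data-space mapping uses the standard DDPM identity x0 = (x_t - sqrt(1 - alpha_bar) * eps) / sqrt(alpha_bar). Everything else is an assumption made for illustration: the box-blur frequency split, the linear momentum schedule, and the names `predict_x0`, `split_frequencies`, and `masf_update`.

```python
import math


def predict_x0(x_t, eps, alpha_bar):
    """Map a noisy sample to data space with the standard DDPM identity."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(x - b * e) / a for x, e in zip(x_t, eps)]


def split_frequencies(x, radius=1):
    """Assumed decomposition: low band = local mean, high band = residual.

    By construction low[i] + high[i] == x[i] (up to rounding).
    """
    n = len(x)
    low = []
    for i in range(n):
        window = x[max(0, i - radius): min(n, i + radius + 1)]
        low.append(sum(window) / len(window))
    high = [xi - li for xi, li in zip(x, low)]
    return low, high


def masf_update(state, x0_hat, progress, max_momentum=0.5):
    """One moving-average step on the data-space estimate, per band.

    progress runs from 0.0 (first, noisiest step) to 1.0 (last step).
    Hypothetical schedule: history is mixed into the low band mostly
    early on and into the high band mostly late.
    Returns (new_state, averaged_x0).
    """
    low, high = split_frequencies(x0_hat)
    if state is not None:
        prev_low, prev_high = state
        m_low = max_momentum * (1.0 - progress)
        m_high = max_momentum * progress
        low = [m_low * p + (1.0 - m_low) * c for p, c in zip(prev_low, low)]
        high = [m_high * p + (1.0 - m_high) * c for p, c in zip(prev_high, high)]
    averaged = [l + h for l, h in zip(low, high)]
    return (low, high), averaged


if __name__ == "__main__":
    # Two fake denoising steps with made-up data-space estimates.
    state, avg = masf_update(None, [1.0, -2.0, 0.5], progress=0.0)
    state, avg = masf_update(state, [0.8, -1.6, 0.7], progress=0.5)
    print(avg)
```

In a real sampler, the averaged estimate would stand in for the raw data-space prediction when forming the next sample, and the frequency split would more plausibly be a proper 2-D transform on image tensors. Both of those choices are outside what the abstract specifies.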