24 May 2024 | Haoxuan Chen*, Yinuo Ren*, Lexing Ying, Grant M. Rotskoff
Diffusion models have become a leading method for generative modeling, but their high computational cost during training and evaluation remains a significant challenge. This paper proposes a novel approach to accelerating the inference of diffusion models by dividing the sampling process into $\mathcal{O}(1)$ blocks, each consisting of parallelizable Picard iterations. The authors provide a rigorous theoretical analysis showing that the algorithm runs in $\mathcal{O}(\operatorname{poly}\log d)$ time, making it the first implementation with provable sub-linear complexity in the data dimension $d$. The approach is compatible with both SDE and probability flow ODE implementations of diffusion models and offers significant improvements over current state-of-the-art samplers. After summarizing its contributions and the necessary preliminaries, the paper develops these parallel sampling techniques, highlighting the potential for fast, efficient sampling of high-dimensional data on modern GPU clusters.
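To make the blocked scheme concrete, below is a minimal NumPy sketch of one block of parallel Picard iterations applied to a probability-flow-ODE-style update. The function name `picard_block`, the toy linear `drift` (standing in for a trained score network), and all grid and iteration counts are illustrative assumptions, not the paper's implementation; the point it demonstrates is that every drift evaluation inside a Picard iteration is independent of the others and can therefore be batched in parallel on a GPU.

```python
import numpy as np

def picard_block(drift, x_init, t_grid, num_iters):
    """One block of parallel Picard iterations (illustrative sketch).

    Within each iteration, the drift evaluations at all grid points are
    independent, so on a GPU they can be dispatched as one parallel call.
    """
    n = len(t_grid)
    h = np.diff(t_grid)                       # step sizes inside the block
    xs = np.tile(x_init, (n, 1))              # seed every grid point with the entry state
    for _ in range(num_iters):
        # Evaluate the drift at all grid points (the parallelizable step).
        f = np.stack([drift(xs[j], t_grid[j]) for j in range(n)])
        # Picard update: x_{t_i} = x_{t_0} + sum_{j<i} h_j * f(x_{t_j}, t_j)
        xs[1:] = x_init + np.cumsum(h[:, None] * f[:-1], axis=0)
    return xs[-1]                             # exit state, passed to the next block

# Toy usage: a linear drift stands in for the learned score-based drift.
drift = lambda x, t: -x
x = np.ones(4)
edges = np.linspace(0.0, 2.0, 5)              # a constant number of blocks
for a, b in zip(edges[:-1], edges[1:]):
    x = picard_block(drift, x, np.linspace(a, b, 17), num_iters=10)
print(x)                                      # approaches exp(-2) * ones(4)
```

The division of labor mirrors the complexity claim: the blocks are processed sequentially but there are only $\mathcal{O}(1)$ of them, while the Picard iterations inside each block are what parallelize, which is how the overall $\mathcal{O}(\operatorname{poly}\log d)$ parallel time arises.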