Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization

Convergence Analysis of Discrete Diffusion Model: Exact Implementation through Uniformization

February 15, 2024 | Hongrui Chen*, Lexing Ying†
This paper presents a theoretical analysis of discrete diffusion models, focusing on their convergence properties and the implementation of the reversed process using uniformization. The discrete diffusion model is formulated as a Continuous Time Markov Chain (CTMC) on a discrete state space, allowing for the modeling of inherently discrete data such as language and graphs. The paper introduces an algorithm that leverages the uniformization technique to implement transitions on random time points, enabling exact simulation of the reversed CTMC without discretization errors. This approach provides a significant advantage over the SDE framework, where discretization errors are prevalent. The paper derives Total Variation (TV) distance and KL divergence guarantees for sampling from any distribution on a hypercube, aligning with state-of-the-art results for diffusion models in $ \mathbb{R}^d $. It shows that the discrete diffusion model can achieve a logarithmic dependence on $ \epsilon $, in contrast to the $ \epsilon^{-2} $ dependence in the SDE framework. The algorithm is analyzed for efficiency, with a complexity of $ O(d \log(d/\epsilon\delta)) $ steps to reach a distribution $ \epsilon $-close in KL divergence to a distribution $ p_\delta $, where $ p_\delta $ is a $ \delta $-perturbation of the data distribution in TV distance. The paper also discusses the convergence properties of the forward process, showing exponential convergence to the uniform distribution over $ \{0,1\}^d $. The discrete score function is defined as the ratio of probabilities between neighboring states, and the algorithm uses a score estimator to approximate this function. The analysis shows that the algorithm can achieve a TV distance bound of $ \tilde{O}(\sqrt{\epsilon}) $, which is nearly optimal for the discrete diffusion model. The results demonstrate that the discrete diffusion model can be implemented efficiently and accurately, with theoretical guarantees that match the best results for continuous diffusion models.This paper presents a theoretical analysis of discrete diffusion models, focusing on their convergence properties and the implementation of the reversed process using uniformization. The discrete diffusion model is formulated as a Continuous Time Markov Chain (CTMC) on a discrete state space, allowing for the modeling of inherently discrete data such as language and graphs. The paper introduces an algorithm that leverages the uniformization technique to implement transitions on random time points, enabling exact simulation of the reversed CTMC without discretization errors. This approach provides a significant advantage over the SDE framework, where discretization errors are prevalent. The paper derives Total Variation (TV) distance and KL divergence guarantees for sampling from any distribution on a hypercube, aligning with state-of-the-art results for diffusion models in $ \mathbb{R}^d $. It shows that the discrete diffusion model can achieve a logarithmic dependence on $ \epsilon $, in contrast to the $ \epsilon^{-2} $ dependence in the SDE framework. The algorithm is analyzed for efficiency, with a complexity of $ O(d \log(d/\epsilon\delta)) $ steps to reach a distribution $ \epsilon $-close in KL divergence to a distribution $ p_\delta $, where $ p_\delta $ is a $ \delta $-perturbation of the data distribution in TV distance. The paper also discusses the convergence properties of the forward process, showing exponential convergence to the uniform distribution over $ \{0,1\}^d $. The discrete score function is defined as the ratio of probabilities between neighboring states, and the algorithm uses a score estimator to approximate this function. The analysis shows that the algorithm can achieve a TV distance bound of $ \tilde{O}(\sqrt{\epsilon}) $, which is nearly optimal for the discrete diffusion model. The results demonstrate that the discrete diffusion model can be implemented efficiently and accurately, with theoretical guarantees that match the best results for continuous diffusion models.
Reach us at info@study.space
[slides and audio] Convergence Analysis of Discrete Diffusion Model%3A Exact Implementation through Uniformization