Conditional diffusion models are fundamental to modern image synthesis and have extensive applications in fields such as computational biology and reinforcement learning. These models incorporate various forms of conditioning information to guide the generation process, yet their theoretical underpinnings remain largely missing. This paper bridges the gap by presenting a sharp statistical theory of distribution estimation using conditional diffusion models. The analysis yields a sample complexity bound that adapts to the smoothness of the data distribution and matches the minimax lower bound. The key to the theoretical development is an approximation result for the conditional score function, which relies on a novel diffused Taylor approximation technique. The paper also demonstrates the utility of the statistical theory in elucidating the performance of conditional diffusion models across diverse applications, including model-based transition kernel estimation in reinforcement learning, solving inverse problems, and reward-conditioned sample generation. The contributions of the paper include establishing the first universal approximation theory of conditional score functions using neural networks, providing sample complexity bounds for distribution estimation, and presenting sub-optimality bounds for generating high-reward samples and estimating posterior means in inverse problems.
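
As background for the approximation result, the conditional score function is the gradient of the log-density of the noised data given the conditioning variable. A minimal sketch of the standard conditional denoising score matching objective follows, in generic notation ($\alpha_t$, $\sigma_t$, and the network $s_\theta$ are placeholder symbols, not the paper's definitions):
\[
  \min_{\theta} \;
  \mathbb{E}_{t,\,(x_0, y),\, z}
  \Bigl[ \bigl\| s_\theta(x_t, y, t)
    - \nabla_{x_t} \log p_t(x_t \mid x_0) \bigr\|_2^2 \Bigr],
  \qquad
  x_t = \alpha_t x_0 + \sigma_t z, \quad z \sim \mathcal{N}(0, I),
\]
where the conditional score of the forward noising kernel has the closed form
\[
  \nabla_{x_t} \log p_t(x_t \mid x_0)
  = -\frac{x_t - \alpha_t x_0}{\sigma_t^2}
  = -\frac{z}{\sigma_t}.
\]
The population minimizer of this objective is the conditional score $\nabla_x \log p_t(x \mid y)$ of the noised data distribution, which is the object the paper's diffused Taylor approximation result concerns.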