Diffusion of Thought: Chain-of-Thought Reasoning in Diffusion Language Models

15 Jul 2024 | Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong
This paper introduces Diffusion-of-Thought (DoT), a novel approach that integrates diffusion models with Chain-of-Thought (CoT) reasoning. DoT allows reasoning steps to diffuse over time through a diffusion language model, offering greater flexibility in trading computation for reasoning performance. Experimental results demonstrate the effectiveness of DoT in multi-digit multiplication, boolean logic, and grade school math problems, with a small diffusion model outperforming a much larger autoregressive model in both efficiency and accuracy. DoT also showcases promising self-correction abilities and benefits from existing reasoning-enhancing techniques like self-consistency decoding. The findings contribute to the understanding and development of reasoning with diffusion language models.