Lean-STaR: Learning to Interleave Thinking and Proving

8 Aug 2024 | Haohan Lin, Zhiqing Sun, Yiming Yang, Sean Welleck
Lean-STaR is a framework that improves the theorem-proving capabilities of language models by integrating informal thoughts into the formal proving process. The key idea is that informal thoughts, which are absent from formal proofs, can significantly improve a model's ability to generate correct proofs.

To build training data, Lean-STaR generates synthetic thoughts retrospectively: given a proof state and its ground-truth next tactic from Lean's Mathlib, a language model produces a natural-language thought that motivates that tactic, yielding thought-augmented examples. At inference time, the model generates a thought before each step of the proof, interleaving natural and formal language. Building on the Self-Taught Reasoner (STaR) approach, the framework then applies two rounds of expert iteration, further fine-tuning the model on the correct proofs it discovers.

The final Lean-STaR model achieves a pass@64 of 36.1% on the miniF2F-test benchmark, a state-of-the-art result that significantly outperforms the base model. The work establishes a new link between informal and formal mathematics, offering a framework for teaching language models to interleave informal thoughts with formal verification and advancing the capabilities of language models in automated theorem proving.
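To make the interleaved thought-then-tactic procedure concrete, here is a minimal sketch of the inference loop. This is an illustration under stated assumptions, not the authors' implementation: `model.generate`, `lean.init_state`, and `lean.apply_tactic` are hypothetical stand-ins for a language-model API and a Lean interaction layer (e.g., a REPL wrapper), and the [STATE]/[THOUGHT]/[TACTIC] prompt tags are illustrative rather than the paper's exact format.

```python
def prove(model, lean, theorem, max_steps=20, samples=8):
    """Sketch of Lean-STaR-style inference: at each proof state, sample
    (thought, tactic) pairs and keep the first tactic Lean accepts.
    All APIs here are hypothetical stand-ins."""
    state = lean.init_state(theorem)      # initial proof state from Lean
    proof = []
    for _ in range(max_steps):
        if state.is_solved:
            return proof                  # complete, Lean-verified proof
        for _ in range(samples):
            # 1) Sample an informal thought conditioned on the proof state.
            thought = model.generate(
                prompt=f"[STATE]\n{state}\n[THOUGHT]\n")
            # 2) Sample a formal tactic conditioned on state + thought.
            tactic = model.generate(
                prompt=f"[STATE]\n{state}\n[THOUGHT]\n{thought}\n[TACTIC]\n")
            ok, next_state = lean.apply_tactic(state, tactic)
            if ok:                        # Lean accepted this step
                proof.append((thought, tactic))
                state = next_state
                break
        else:
            return None                   # no sampled tactic succeeded
    return None                           # step budget exhausted
```

In the actual system, many such attempts are sampled per theorem (hence the pass@64 metric), and during expert iteration the verified (state, thought, tactic) trajectories collected by a loop like this one are fed back as fine-tuning data.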