Automata-based constraints for language model decoding

2024 | Terry Koo*, Frederick Liu*, and Luheng He
This paper presents a method for constraining language models (LMs) to generate output in formal languages using automata theory. LMs are often expected to produce strings in some formal language, such as structured data, API calls, or code snippets. While LMs can be tuned to improve their adherence to formal syntax, tuning does not guarantee conformance, especially for the smaller LMs suited to large-scale deployment.

To prevent downstream parsing errors, the paper derives an efficient closed-form solution from automata theory for regular languages, a broad class of formal languages with many practical applications, including API calls and schema-guided JSON and YAML. It adds pragmatic extensions for coping with the high branching factor of LM vocabularies, and extends the technique to deterministic context-free languages via push-down automata, which similarly admit an efficient closed-form solution and enable grammar-based constraints. Compared with existing approaches, the resulting system compiles constraints roughly 7,000x faster, is provably correct, and can be extended in a modular fashion.

The paper reviews related work on using automata to constrain learned models, demonstrates applications such as JSON expressions and Python dataclasses, and evaluates the efficiency and correctness of the approach experimentally. It concludes that automata theory provides a clean, efficient, and correct solution for constraining LM decoding.
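To make the core idea concrete, here is a minimal, illustrative sketch (not the paper's implementation) of automaton-constrained decoding: a character-level DFA for a toy regular language is used to decide, at each step, which vocabulary tokens keep the output inside the language, and the LM may only choose among those. All names here (step_token, allowed_tokens, greedy_pick, the toy vocabulary) are hypothetical; the paper instead compiles the constraint into a token-level automaton offline rather than simulating characters during decoding.

```python
from typing import Dict, Optional

# Character-level DFA for the toy regular language  "yes" | "no".
# States are ints; a missing edge means the dead (reject) state, written -1.
DFA = {
    (0, "y"): 1, (1, "e"): 2, (2, "s"): 3,   # path accepting "yes"
    (0, "n"): 4, (4, "o"): 3,                # path accepting "no"
}
START, ACCEPT = 0, {3}

def step_char(state: int, ch: str) -> int:
    return DFA.get((state, ch), -1)

def step_token(state: int, token: str) -> int:
    """Run every character of a vocabulary token through the DFA."""
    for ch in token:
        state = step_char(state, ch)
        if state == -1:
            return -1
    return state

# Toy subword vocabulary; real vocabularies have ~100k entries.
VOCAB = ["y", "es", "yes", "n", "o", "no", "xyz", "<eos>"]

def allowed_tokens(state: int) -> Dict[str, int]:
    """Tokens whose characters keep the automaton alive from `state`.
    The paper precomputes this table per state (a token-level automaton);
    here it is recomputed on the fly for clarity."""
    out: Dict[str, int] = {}
    for tok in VOCAB:
        if tok == "<eos>":
            if state in ACCEPT:              # may stop only in an accept state
                out[tok] = state
            continue
        nxt = step_token(state, tok)
        if nxt != -1:
            out[tok] = nxt
    return out

def greedy_pick(allowed: Dict[str, int]) -> Optional[str]:
    """Stand-in for an LM scoring function: pretend it prefers "yes"."""
    for tok in ["yes", "y", "es", "no", "n", "o", "<eos>"]:
        if tok in allowed:
            return tok
    return None

# Decoding loop: at each step the LM's choice is masked to allowed tokens.
state, text = START, ""
while True:
    allowed = allowed_tokens(state)
    tok = greedy_pick(allowed)
    if tok is None or tok == "<eos>":
        break
    text += tok
    state = allowed[tok]
print(text)  # always a string of the language, here "yes"
```

In a real system the per-state token masks are applied to the model's logits before sampling, and the per-state tables are compiled once ahead of time, which is where the paper's efficiency gains come from.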