System-1.x: Learning to Balance Fast and Slow Planning with Language Models

19 Jul 2024 | Swarnadeep Saha, Archiki Prasad, Justin Chih-Yao Chen, Peter Hase, Elias Stengel-Eskin, Mohit Bansal
This paper introduces System-1.x, a controllable planning framework that uses language models to combine fast (System-1) and slow (System-2) planning modes. System-1.x generates hybrid plans, balancing the two modes according to the difficulty of the problem.

The framework consists of three components: a controller, a System-1 Planner, and a System-2 Planner. The controller decomposes a planning problem into sub-goals and classifies each as easy or hard, which determines the planner used for that sub-goal. The System-1 Planner generates plans for easy sub-goals without explicit search, while the System-2 Planner solves hard sub-goals through explicit search. The controller is trained using a user-specified hybridization factor x, which sets the balance between the two planning modes. System-1.x is trained on top of a single base language model and requires only search traces as supervision.

Experiments on two classical planning tasks, Maze Navigation and Blocksworld, show that, given an exploration budget, System-1.x outperforms a System-1 Planner, a System-2 Planner, and a symbolic planner (A* search) in accuracy and efficiency. The planner is controllable, flexible, and generalizable: adjusting the hybridization factor x shifts the balance between the two planning modes. System-1.x is also robust to the choice of search algorithm (BFS, DFS, or A*) and can be converted into a neuro-symbolic planner by replacing the System-2 Planner with a symbolic solver.

The paper also discusses the limitations of System-1.x, including its focus on fully-observable, deterministic domains and the potential for variable performance when all sub-goals are easy or all are hard.
The results highlight the effectiveness of System-1.x in balancing fast and slow planning modes, making it a promising approach for long-horizon planning with language models.
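
To make the controller's routing concrete, here is a minimal Python sketch of the idea on a toy maze. It is an illustration under stated assumptions, not the paper's implementation: the planners are simple stand-ins rather than language models (a straight-line walk for System-1, breadth-first search for System-2), the sub-goals and `hardness` scores are supplied by hand, and the rule "mark the hardest round(x * n) sub-goals as hard" is a hypothetical reading of the hybridization factor x. All function and variable names are invented for this example.

```python
from collections import deque


def classify(hardness, x):
    """Label the round(x * n) hardest sub-goals 'hard' (assumed rule, not the paper's)."""
    n_hard = round(x * len(hardness))
    ranked = sorted(range(len(hardness)), key=lambda i: hardness[i], reverse=True)
    hard = set(ranked[:n_hard])
    return ["hard" if i in hard else "easy" for i in range(len(hardness))]


def system1_plan(start, goal):
    """Fast stand-in: walk straight toward the goal with no search (ignores walls)."""
    path = [start]
    r, c = start
    while (r, c) != goal:
        if r != goal[0]:
            r += 1 if goal[0] > r else -1
        else:
            c += 1 if goal[1] > c else -1
        path.append((r, c))
    return path


def system2_plan(start, goal, walls, size):
    """Slow stand-in: breadth-first search over the grid, avoiding walls."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in walls and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    raise ValueError("sub-goal is unreachable")


def hybrid_plan(sub_goals, hardness, x, walls, size):
    """Route each sub-goal to a planner by its label and stitch the segments together."""
    labels = classify(hardness, x)
    plan = [sub_goals[0][0]]
    for (start, goal), label in zip(sub_goals, labels):
        if label == "hard":
            segment = system2_plan(start, goal, walls, size)
        else:
            segment = system1_plan(start, goal)
        plan.extend(segment[1:])  # drop the repeated start cell
    return plan, labels


# Toy 4x4 maze with one wall; the second sub-goal is the only one that needs a detour.
SIZE = 4
WALLS = {(1, 2)}
SUB_GOALS = [((0, 0), (0, 2)), ((0, 2), (2, 2)), ((2, 2), (3, 2)), ((3, 2), (3, 3))]
HARDNESS = [0.0, 1.0, 0.0, 0.0]  # hand-assigned scores, purely illustrative

plan, labels = hybrid_plan(SUB_GOALS, HARDNESS, 0.25, WALLS, SIZE)
```

In this toy setup, x = 0.25 sends only the blocked sub-goal to the search-based stand-in, and the stitched plan avoids the wall. Setting x = 0 routes everything to the fast stand-in, which walks through the wall cell and yields an invalid plan; setting x = 1 searches every sub-goal, which is valid but spends search effort on segments that did not need it.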