26 Apr 2024 | Lucas Lehnert, Sainbayar Sukhbaatar, DiJia Su, Qingqing Zheng, Paul Mcvay, Michael Rabbat, Yuandong Tian
This paper introduces Searchformer, an encoder-decoder Transformer trained to solve complex planning tasks by learning from the search dynamics of the A* algorithm. The model is first trained to predict A*'s step-by-step search trace, and is then fine-tuned via expert iteration to produce optimal plans using fewer search steps. On maze navigation, Searchformer significantly outperforms baselines that predict the optimal plan directly, while using a 5-10x smaller model and a 10x smaller training dataset; on Sokoban, it optimally solves 93.7% of previously unseen puzzles with up to 26.8% fewer search steps than the A* implementation used for training. The model also scales to larger and more complex tasks, with an improved fraction of solved puzzles and shorter search dynamics.

The key idea is to incorporate A*'s execution traces into the training data: the search dynamics are serialized as a token sequence recording when task states are added to and removed from the search tree during planning. By imitating this process, the Transformer learns the search procedure itself, and fine-tuning then lets it discover more efficient implicit heuristics. The results show that Searchformer can solve planning tasks in fewer search steps than the symbolic planner it was trained on, and they highlight how strongly the composition of the training data, in particular the inclusion of execution traces, affects a Transformer's ability to generate optimal plans. The study offers a new approach to training Transformers for planning, and demonstrates their potential for complex decision-making problems.
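To make the trace-serialization idea concrete, here is a minimal Python sketch (not the authors' released code) of an A* search that emits its own execution trace as a flat token list. The exact token format here, an event word (`create`/`close`) followed by position, cost `c`, and heuristic `h` tokens, is an illustrative assumption modeled on the paper's description, as are the function names.

```python
import heapq
import itertools

def manhattan(a, b):
    """Admissible heuristic for a 4-connected grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def astar_with_trace(start, goal, walls, width, height):
    """Run A* on a grid maze and return (optimal_plan, trace_tokens).

    trace_tokens records the search dynamics: a "create" event when a
    node enters the frontier, a "close" event when it is expanded,
    each followed by position and cost tokens.
    """
    trace = ["create", str(start), "c0", f"h{manhattan(start, goal)}"]
    tie = itertools.count()  # tie-breaker so heap entries never compare parents
    frontier = [(manhattan(start, goal), 0, next(tie), start, None)]
    parents, closed = {}, set()
    while frontier:
        f, g, _, pos, parent = heapq.heappop(frontier)
        if pos in closed:            # stale duplicate frontier entry
            continue
        closed.add(pos)
        parents[pos] = parent
        trace += ["close", str(pos), f"c{g}", f"h{f - g}"]
        if pos == goal:              # walk parent pointers back to start
            plan = []
            while pos is not None:
                plan.append(pos)
                pos = parents[pos]
            return plan[::-1], trace
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (pos[0] + dx, pos[1] + dy)
            if (0 <= nxt[0] < width and 0 <= nxt[1] < height
                    and nxt not in walls and nxt not in closed):
                h = manhattan(nxt, goal)
                trace += ["create", str(nxt), f"c{g + 1}", f"h{h}"]
                heapq.heappush(frontier, (g + 1 + h, g + 1, next(tie), nxt, pos))
    return None, trace

# Example: a 3x3 maze with one wall.
plan, trace = astar_with_trace((0, 0), (2, 2), walls={(1, 1)}, width=3, height=3)
```

A training example would then pair the task description (maze layout, start, goal) with `trace` followed by `plan` as the target sequence; since the trace length reflects how much search was performed, fine-tuning the model to emit shorter traces that still end in an optimal plan is what yields the reported reduction in search steps.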