Graph-enhanced Large Language Models in Asynchronous Plan Reasoning

2024 | Fangru Lin, Emanuele La Malfa, Valentin Hofmann, Elle Michelle Yang, Anthony G. Cohn, Janet B. Pierrehumbert
This paper introduces Plan Like a Graph (PLaG), a novel prompting technique for improving the performance of large language models (LLMs) on asynchronous plan reasoning. The study evaluates whether LLMs can solve complex planning tasks that involve both sequential and parallel actions. The researchers developed a benchmark called AsyncHow, containing 1.6K high-quality instances of real-life tasks, and found that existing LLMs, including GPT-4 and LLaMA-2, perform poorly unless given detailed solution illustrations. PLaG, which combines graph representations with natural language prompts, significantly improves performance across all levels of task complexity. However, even with PLaG, LLM performance still degrades on highly complex planning tasks, suggesting limits in the models' ability to simulate digital devices.
The study highlights PLaG as a method that consistently boosts state-of-the-art (SOTA) model performance on planning tasks, while showing that LLMs continue to struggle as task complexity increases. The paper also discusses the implications of these findings for future research on enhancing conceptual representations in LLMs. The benchmark and code are publicly available for further research.
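To make the task concrete: an asynchronous plan can be represented as a dependency DAG, where the minimal completion time (assuming unlimited parallelism) is the longest, or critical, path. The sketch below illustrates this idea only; the step names, durations, and code are invented examples for exposition, not the paper's PLaG implementation or AsyncHow data.

```python
# Hypothetical example of an asynchronous plan as a dependency DAG.
# Durations are in minutes; edges list a step's prerequisites.
steps = {"boil water": 10, "chop vegetables": 5, "cook soup": 15}
deps = {"cook soup": ["boil water", "chop vegetables"]}

def optimal_duration(steps, deps):
    """Minimal completion time with unlimited parallelism:
    the length of the longest (critical) path through the DAG."""
    memo = {}
    def finish(step):
        # Earliest finish time = own duration + latest prerequisite finish.
        if step not in memo:
            memo[step] = steps[step] + max(
                (finish(p) for p in deps.get(step, [])), default=0)
        return memo[step]
    return max(finish(s) for s in steps)

print(optimal_duration(steps, deps))  # 25: chopping overlaps the 10-min boil
```

Naively summing all durations would give 30 minutes; recognizing that boiling and chopping can run in parallel yields 25, which is the kind of reasoning the benchmark tests.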