July 17, 2024 | Aske Plaat, Annie Wong, Suzan Verberne, Joost Broekens, Niki van Stein, Thomas Bäck
This survey explores reasoning with large language models (LLMs), focusing on prompt-based methods for multi-step reasoning. LLMs, trained on large datasets, have achieved breakthroughs in tasks like translation, summarization, and question answering. Recent advances in chain-of-thought prompt learning have enabled LLMs to perform complex reasoning, such as solving grade school math word problems. The paper reviews the rapidly expanding field of prompt-based reasoning, identifying different ways to generate, evaluate, and control multi-step reasoning. It provides in-depth coverage of core approaches and open problems, and proposes a research agenda for the near future. The paper highlights the relationship between reasoning and prompt-based learning, and connects reasoning to sequential decision processes and reinforcement learning. It finds that self-improvement, self-reflection, and some metacognitive abilities of the reasoning process are possible through the judicious use of prompts. True self-improvement and self-reasoning, to go from reasoning with LLMs to reasoning by LLMs, remain future work.
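To make the chain-of-thought idea concrete, here is a minimal Python sketch that builds a few-shot CoT prompt in the style of Wei et al. (2022): a worked example with explicit intermediate steps is prepended to the new question so the model imitates step-by-step reasoning. The `ask_llm` call is a hypothetical placeholder for whatever completion API is actually used.

```python
# A minimal sketch of chain-of-thought (CoT) prompting in the style of
# Wei et al. (2022). `ask_llm` is a hypothetical stand-in for a real
# completion API; everything else is plain Python.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
    "6 tennis balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend a worked example so the model imitates step-by-step reasoning."""
    return f"{COT_EXEMPLAR}\nQ: {question}\nA:"

prompt = build_cot_prompt(
    "A cafeteria had 23 apples. They used 20 for lunch and bought 6 more. "
    "How many apples do they have?"
)
# answer = ask_llm(prompt)  # hypothetical LLM call; a CoT-primed model
#                           # typically ends with "The answer is 9."
print(prompt)
```

The key design point is that no parameters change: the entire "learning" signal is the exemplar in the prompt, which is what distinguishes prompt-based reasoning from fine-tuning.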
The paper discusses the training pipeline of LLMs, covering data preparation, pretraining, fine-tuning, instruction tuning, preference alignment, optimization, and inference. In-context learning, also known as prompt-based learning, is a form of few-shot learning that emerges in LLMs with hundreds of billions of parameters: the model learns a task from examples provided in the prompt and can produce correct answers without any parameter updates.

The survey covers benchmarks used to evaluate reasoning performance, including GSM8K, ASDiv, MAWPS, SVAMP, and AQuA, reviews the papers that use them, and organizes the field around how reasoning steps are generated, evaluated, and controlled. For step generation it identifies three main approaches: hand-written prompts, prompts using external knowledge, and model-generated prompts. For step evaluation it discusses self-assessment, tool-based validation, and validation by an external model. For step control it discusses greedy selection, ensemble strategies, and reinforcement learning. The survey also relates reasoning to neighboring topics such as self-reflection, metacognition, and artificial general intelligence, and concludes with a research agenda for future work on reasoning with LLMs.
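As a toy illustration of the tool-based validation idea from the step-evaluation discussion, the sketch below re-checks the arithmetic claims inside a generated reasoning step with Python itself, loosely in the spirit of program-aided approaches. The `a <op> b = c` step format and the regular expression are illustrative assumptions, not the survey's specific method.

```python
# A toy sketch of tool-based step validation: recompute every
# "a <op> b = c" arithmetic claim found in a reasoning step and
# flag the step if any claim is wrong. The step format and regex
# are illustrative assumptions.

import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def check_arithmetic(step: str) -> bool:
    """Verify every 'a <op> b = c' claim found in a reasoning step."""
    for a, op, b, claimed in re.findall(r"(\d+)\s*([+\-*/])\s*(\d+)\s*=\s*(\d+)", step):
        if OPS[op](int(a), int(b)) != int(claimed):
            return False
    return True

print(check_arithmetic("2 cans of 3 balls each is 6 balls. 5 + 6 = 11."))  # True
print(check_arithmetic("5 + 6 = 12."))                                     # False
```

A faulty step can then be rejected or regenerated rather than propagated into later reasoning, which is the core motivation for evaluating steps at all.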
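Among the ensemble strategies mentioned for step control, self-consistency (Wang et al., 2023) is the canonical example: sample several reasoning chains at a nonzero temperature and take a majority vote over their final answers. Below is a minimal sketch, assuming a hypothetical `sample_chain` function that returns one sampled completion ending in "The answer is X."

```python
# A minimal sketch of self-consistency (Wang et al., 2023): majority-vote
# the final answers of several independently sampled reasoning chains.
# `sample_chain` is a hypothetical stand-in for a sampled LLM completion.

from collections import Counter
import re

def final_answer(chain: str) -> str | None:
    """Pull the answer out of a chain ending in 'The answer is X.'"""
    match = re.search(r"The answer is\s+(-?\d+)", chain)
    return match.group(1) if match else None

def self_consistency(question: str, sample_chain, n: int = 10) -> str | None:
    """Majority vote over n independently sampled reasoning chains."""
    votes = Counter()
    for _ in range(n):
        answer = final_answer(sample_chain(question))
        if answer is not None:
            votes[answer] += 1
    return votes.most_common(1)[0][0] if votes else None

# Demo with a fake sampler that is right 7 times out of 10:
fake = iter(["The answer is 11."] * 7 + ["The answer is 12."] * 3)
print(self_consistency("(question)", lambda q: next(fake)))  # -> "11"
```

The vote acts as a simple control mechanism over the space of reasoning paths: individual chains may derail, but the mode of their answers is considerably more reliable than any single greedy decode.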