29 Mar 2024 | Shuang Wu, Liwen Zhu, Tao Yang, Shiwei Xu, Qiang Fu, Wei Yang, Haobo Fu
This paper introduces a framework that pairs Large Language Models (LLMs) with an external Thinker module to strengthen the reasoning of LLM-based agents in the game Werewolf. Rather than relying on prompt engineering, the Thinker draws directly on knowledge from databases and uses optimization techniques to handle complex logical analysis and domain-specific knowledge. The framework has three components: Listener, Thinker, and Presenter. The Listener processes natural language input and converts it into structured features for the Thinker, which specializes in System-2 reasoning tasks. The Presenter then generates coherent, contextually appropriate language output guided by the Thinker's strategic instructions.
The framework is evaluated on a 9-player Werewolf game, where the Thinker module significantly improves the agents' performance in deductive reasoning, speech generation, and online game evaluation. In addition, a fine-tuned 6B LLM surpasses GPT-4 when integrated with the Thinker. The paper also contributes the largest dataset for social deduction games to date.
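
To make the Listener, Thinker, Presenter split more concrete, here is a minimal Python sketch of how the three stages could hand data to one another. It is an illustration under stated assumptions, not the authors' implementation: the `Features` and `Instruction` schemas, all function names, the regex-based Listener, the accusation-counting Thinker, and the template-based Presenter are hypothetical stand-ins. In the paper, the Listener and Presenter are LLM-based and the Thinker relies on database knowledge and optimization techniques rather than a counting rule.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Features:
    """Structured view of one speech (hypothetical schema)."""
    speaker: int
    accused: List[int]


@dataclass
class Instruction:
    """Strategic decision handed to the Presenter (hypothetical schema)."""
    target: int
    stance: str


def listener(speaker: int, speech: str) -> Features:
    # Stand-in for LLM-based extraction: every "player K" mention
    # is treated as an accusation against player K.
    mentioned = [int(m) for m in re.findall(r"player (\d+)", speech.lower())]
    return Features(speaker=speaker, accused=mentioned)


def thinker(history: List[Features], me: int) -> Instruction:
    # Placeholder for the reasoning module: pick the most-accused
    # player other than ourselves. This is NOT the paper's method.
    counts = Counter(p for f in history for p in f.accused if p != me)
    if not counts:
        return Instruction(target=-1, stance="pass")
    target, _ = counts.most_common(1)[0]
    return Instruction(target=target, stance="accuse")


def presenter(instruction: Instruction) -> str:
    # Stand-in for LLM-based generation: render the instruction
    # with a fixed template.
    if instruction.stance == "pass":
        return "I have no strong read yet."
    return f"I suspect player {instruction.target} and will vote for them."


def agent_turn(speeches: List[Tuple[int, str]], me: int) -> str:
    # Listener -> Thinker -> Presenter, in that order.
    history = [listener(speaker, text) for speaker, text in speeches]
    return presenter(thinker(history, me))


if __name__ == "__main__":
    speeches = [
        (1, "Player 3 is a werewolf."),
        (2, "Player 3 lied yesterday."),
        (5, "Player 2 is suspicious."),
    ]
    print(agent_turn(speeches, me=4))
```

The point of the design is that only structured data crosses the Thinker boundary, so the reasoning step can be swapped or trained independently of the language models on either side.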