Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration


26 May 2024 | Yang Zhang, Shixin Yang, Chenjia Bai, Fei Wu, Xiu Li, Zhen Wang, Xuelong Li
This paper proposes Reinforced Advantage feedback (ReAd), a framework for efficient LLM grounding in embodied multi-agent collaboration that enables efficient self-refinement of plans. The key idea is to learn a sequential advantage function from LLM-planned data via critic regression, and then treat the LLM planner as an optimizer that generates actions maximizing this advantage, which lets the LLM discern whether an action contributes to accomplishing the final task. Theoretically, the paper extends advantage-weighted regression in reinforcement learning to multi-agent systems. Two optional plan-refinement schemes are presented: Sequential Individual Plan Refinement with the local advantage (ReAd-S) and Joint Plan Refinement with the joint advantage (ReAd-J).
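As a point of reference, the standard single-agent advantage-weighted regression objective, which the paper generalizes to the multi-agent setting, can be sketched as follows (notation here follows the usual RL convention rather than the paper's exact formulation):

$$\pi_{k+1} = \arg\max_{\pi}\; \mathbb{E}_{s \sim d^{\pi_k},\, a \sim \pi_k(\cdot \mid s)} \left[ \log \pi(a \mid s)\, \exp\!\left(\tfrac{1}{\beta} A^{\pi_k}(s, a)\right) \right],$$

where $A^{\pi_k}(s,a) = Q^{\pi_k}(s,a) - V^{\pi_k}(s)$ is the advantage and $\beta > 0$ is a temperature. Intuitively, actions with higher advantage are weighted more heavily; ReAd applies the same principle by asking the LLM planner to propose actions that maximize a learned advantage estimate.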
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate while significantly reducing both the interaction steps of agents and the query rounds of LLMs, demonstrating efficient LLM grounding. The paper also reviews related work on task planning with LLMs and on grounding LLMs with RL. Ablation studies show that plan refinement has a remarkable impact on grounding the LLM. The advantage score plays two roles in ReAd: (i) prompting as optimizing, where the LLM is asked to generate actions with the highest score, and (ii) feedback as refinement, where the plan is re-planned whenever the score falls below a threshold. Because an action can be refined over multiple rounds, policy refinement makes the method a multi-step process; a sketch of this loop is given below. The paper concludes that ReAd is a novel form of LLM feedback for closed-loop planning in multi-agent collaboration, and that advantage feedback can handle sudden disturbances and is crucial for refinement.
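A minimal sketch of this closed-loop refinement idea, assuming a hypothetical advantage critic advantage(state, action) learned from LLM-planned data and a hypothetical llm_plan(state, feedback) wrapper around the LLM planner (names, prompt format, and environment interface are illustrative, not the authors' implementation):

```python
# Illustrative sketch of advantage-feedback plan refinement (not the paper's code).
# Assumes: llm_plan(state, feedback) queries an LLM planner for a joint action,
# advantage(state, action) is a critic-regressed advantage estimate, and
# env.step(action) returns (next_state, done).

THRESHOLD = 0.0        # re-plan when the estimated advantage falls below this value
MAX_REFINEMENTS = 4    # cap on refinement rounds per environment step

def plan_with_read(env, llm_plan, advantage, max_steps=50):
    state = env.reset()
    for _ in range(max_steps):
        feedback = None
        action = llm_plan(state, feedback)
        # Feedback as refinement: re-prompt while the advantage score is too low.
        for _ in range(MAX_REFINEMENTS):
            score = advantage(state, action)
            if score >= THRESHOLD:
                break
            feedback = (f"Advantage score {score:.2f} is below the threshold; "
                        f"propose a joint action with a higher score.")
            action = llm_plan(state, feedback)
        state, done = env.step(action)
        if done:
            break
    return state
```

In this sketch the score is treated as a joint advantage over the full joint action, matching ReAd-J; ReAd-S would apply the same check agent by agent using local advantages.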