Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

26 Mar 2024 | Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu
This paper introduces the Chain-of-Action (CoA) framework for multimodal, retrieval-augmented question answering (QA). The framework addresses two major challenges in current QA applications: unfaithful hallucination, where responses do not align with real-time or domain-specific facts, and weak reasoning over compositional information.

The key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a chain of reasoning steps through systematic prompting and pre-designed actions. Three types of domain-adaptable 'Plug-and-Play' actions retrieve real-time information from heterogeneous sources, and a multi-reference faith score (MRFS) verifies intermediate answers and resolves conflicts among retrieved references. By integrating retrieval directly into the reasoning process, CoA improves both faithfulness and multi-step reasoning.

Empirically, CoA outperforms existing methods on public QA benchmarks in both retrieval and non-retrieval settings, and it generalizes to a real-world Web3 QA product, where it produced significant user engagement and positive feedback. The framework is extensible to diverse modalities and supports real-time information retrieval. Its efficiency, robustness, and ability to reduce information conflicts make it a promising approach for QA systems that depend on real-time or domain-specific information.
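This summary does not reproduce the paper's prompts or action set, but the reasoning-retrieval idea can be sketched concretely. The snippet below decomposes a question into a chain of sub-questions, dispatches each to a pluggable retrieval action, and composes a final answer grounded in the retrieved evidence. All names here (`decompose`, `web_query`, the prompt wording) are illustrative assumptions, not the authors' actual API.

```python
# Hypothetical sketch of a Chain-of-Action-style reasoning-retrieval loop.
# Function names, action names, and prompts are illustrative stand-ins;
# the paper's actual prompting scheme is not reproduced in this summary.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ActionNode:
    sub_question: str   # one step of the decomposed reasoning chain
    action: str         # which plug-and-play action should answer it
    answer: str = ""    # filled in after the action runs

def decompose(question: str, llm: Callable[[str], str]) -> List[ActionNode]:
    """Prompt the LLM to split a complex question into sub-questions,
    each tagged with the action expected to answer it."""
    prompt = (
        "Decompose the question into sub-questions, one per line, "
        f"formatted as '<action>: <sub-question>'.\nQuestion: {question}"
    )
    nodes = []
    for line in llm(prompt).splitlines():
        action, _, sub_q = line.partition(":")
        if sub_q:
            nodes.append(ActionNode(sub_q.strip(), action.strip()))
    return nodes

def run_chain(question: str, llm: Callable[[str], str],
              actions: Dict[str, Callable[[str], str]]) -> str:
    """Execute each step's retrieval action, then compose a final
    answer grounded only in the collected evidence."""
    chain = decompose(question, llm)
    for node in chain:
        retrieve = actions.get(node.action, actions["web_query"])
        node.answer = retrieve(node.sub_question)  # real-time retrieval step
    evidence = "\n".join(f"- {n.sub_question}: {n.answer}" for n in chain)
    return llm(f"Answer '{question}' using only this evidence:\n{evidence}")
```

In a deployment, `actions` would register the three domain-adaptable actions the paper proposes, each wrapping a different heterogeneous source (for example a web search API, a domain knowledge base, and a tabular-data tool); the names and split used here are assumptions for illustration.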
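The summary likewise does not give the MRFS formula, so the following is only a loose illustration of a multi-reference verification step: it scores a candidate answer by token-level overlap against each retrieved reference and accepts the answer only if at least one reference supports it. The overlap metric, the `max` aggregation, and the 0.5 threshold are all assumed stand-ins, not the paper's definition.

```python
# Illustrative multi-reference verification in the spirit of MRFS.
# The actual MRFS definition is not given in this summary; token-level
# precision/recall against each reference is an assumed stand-in metric.
from typing import List

def faith_score(answer: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between the answer
    and one reference (0 = no overlap, 1 = identical token sets)."""
    a, r = set(answer.lower().split()), set(reference.lower().split())
    if not a or not r:
        return 0.0
    overlap = len(a & r)
    precision, recall = overlap / len(a), overlap / len(r)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def verify(answer: str, references: List[str], threshold: float = 0.5) -> bool:
    """Accept the answer only if it is sufficiently supported by at
    least one retrieved reference; otherwise trigger re-retrieval."""
    return max(faith_score(answer, ref) for ref in references) >= threshold
```

In a CoA-style pipeline, a step whose answer fails this check would be re-answered with fresh retrieval, which mirrors how the framework is described as resolving conflicts between the model's parametric knowledge and retrieved facts.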