Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

26 Mar 2024 | Zhenyu Pan, Haozheng Luo, Manling Li, Han Liu
This paper introduces the Chain-of-Action (CoA) framework for multimodal, retrieval-augmented question answering (QA). The framework addresses two major challenges in current QA applications: unfaithful hallucination, where responses do not align with real-time or domain-specific facts, and weak reasoning over compositional information.

The key contribution is a novel reasoning-retrieval mechanism that decomposes a complex question into a chain of reasoning steps through systematic prompting and pre-designed actions. Three types of domain-adaptable 'Plug-and-Play' actions retrieve real-time information from heterogeneous sources, and a multi-reference faith score (MRFS) verifies intermediate answers and resolves conflicts among retrieved references. By integrating retrieval directly into the reasoning process, CoA improves both faithfulness and multi-step reasoning.

Empirically, CoA outperforms existing methods on public QA benchmarks in both retrieval and non-retrieval settings, and it generalizes to a real-world Web3 QA product, where it produced significant user engagement and positive feedback. The framework is extensible to diverse modalities and supports real-time information retrieval. Its efficiency, robustness, and ability to reduce information conflicts make it a promising approach for QA systems that depend on real-time or domain-specific information.
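This summary does not reproduce the paper's prompts or action set, but the reasoning-retrieval idea can be sketched concretely. The snippet below decomposes a question into a chain of sub-questions, dispatches each to a pluggable retrieval action, and composes a final answer grounded in the retrieved evidence. All names here (`decompose`, `web_query`, the prompt wording) are illustrative assumptions, not the authors' actual API.

```python
# Hypothetical sketch of a Chain-of-Action-style reasoning-retrieval loop.
# Function names, action names, and prompts are illustrative stand-ins;
# the paper's actual prompting scheme is not reproduced in this summary.
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class ActionNode:
    sub_question: str   # one step of the decomposed reasoning chain
    action: str         # which plug-and-play action should answer it
    answer: str = ""    # filled in after the action runs

def decompose(question: str, llm: Callable[[str], str]) -> List[ActionNode]:
    """Prompt the LLM to split a complex question into sub-questions,
    each tagged with the action expected to answer it."""
    prompt = (
        "Decompose the question into sub-questions, one per line, "
        f"formatted as '<action>: <sub-question>'.\nQuestion: {question}"
    )
    nodes = []
    for line in llm(prompt).splitlines():
        action, _, sub_q = line.partition(":")
        if sub_q:
            nodes.append(ActionNode(sub_q.strip(), action.strip()))
    return nodes

def run_chain(question: str, llm: Callable[[str], str],
              actions: Dict[str, Callable[[str], str]]) -> str:
    """Execute each step's retrieval action, then compose a final
    answer grounded only in the collected evidence."""
    chain = decompose(question, llm)
    for node in chain:
        retrieve = actions.get(node.action, actions["web_query"])
        node.answer = retrieve(node.sub_question)  # real-time retrieval step
    evidence = "\n".join(f"- {n.sub_question}: {n.answer}" for n in chain)
    return llm(f"Answer '{question}' using only this evidence:\n{evidence}")
```

In a deployment, `actions` would register the three domain-adaptable actions the paper proposes, each wrapping a different heterogeneous source (for example a web search API, a domain knowledge base, and a tabular-data tool); the names and split used here are assumptions for illustration.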
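The summary likewise does not give the MRFS formula, so the following is only a loose illustration of a multi-reference verification step: it scores a candidate answer by token-level overlap against each retrieved reference and accepts the answer only if at least one reference supports it. The overlap metric, the `max` aggregation, and the 0.5 threshold are all assumed stand-ins, not the paper's definition.

```python
# Illustrative multi-reference verification in the spirit of MRFS.
# The actual MRFS definition is not given in this summary; token-level
# precision/recall against each reference is an assumed stand-in metric.
from typing import List

def faith_score(answer: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between the answer
    and one reference (0 = no overlap, 1 = identical token sets)."""
    a, r = set(answer.lower().split()), set(reference.lower().split())
    if not a or not r:
        return 0.0
    overlap = len(a & r)
    precision, recall = overlap / len(a), overlap / len(r)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def verify(answer: str, references: List[str], threshold: float = 0.5) -> bool:
    """Accept the answer only if it is sufficiently supported by at
    least one retrieved reference; otherwise trigger re-retrieval."""
    return max(faith_score(answer, ref) for ref in references) >= threshold
```

In a CoA-style pipeline, a step whose answer fails this check would be re-answered with fresh retrieval, which mirrors how the framework is described as resolving conflicts between the model's parametric knowledge and retrieved facts.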