The paper introduces AMOR, an adaptable modular knowledge agent designed to handle complex tasks by reasoning with external knowledge bases and adapting to specific domains through human supervision. AMOR is built on open-source large language models (LLMs) and uses a finite state machine (FSM) to structure its reasoning process, so that reasoning proceeds by executing disentangled modules and transitioning between them. This design enables process feedback: humans can give feedback directly to individual modules rather than only on the final outcome, which improves the agent's performance.
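To make the FSM idea concrete, the following is a minimal sketch of an agent whose reasoning is a walk over named modules. It is not AMOR's actual implementation: the module names (`retrieve`, `answer`, `give_up`), the branch tokens, the `FSMAgent` class, and the toy knowledge base are all hypothetical, and plain Python functions stand in for LLM-backed modules.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a module reads/writes shared memory and returns
# (output_text, branch_token). The branch token selects the next state.
Memory = dict
Module = Callable[[Memory], Tuple[str, str]]

# Toy stand-in for an external knowledge base (assumption for illustration).
TOY_KB = {"capital of france": "Paris"}


def retrieve(memory: Memory) -> Tuple[str, str]:
    """Look up the question in the toy knowledge base."""
    doc = TOY_KB.get(memory["question"].lower().rstrip("?"))
    if doc is None:
        return "", "missing"
    memory["evidence"].append(doc)
    return doc, "found"


def answer(memory: Memory) -> Tuple[str, str]:
    """Answer from the most recently retrieved evidence."""
    memory["answer"] = memory["evidence"][-1]
    return memory["answer"], "done"


def give_up(memory: Memory) -> Tuple[str, str]:
    """Fallback module when no evidence was found."""
    memory["answer"] = "unknown"
    return "unknown", "done"


@dataclass
class FSMAgent:
    modules: Dict[str, Module]                # state name -> module
    transitions: Dict[Tuple[str, str], str]   # (state, branch) -> next state
    start: str = "retrieve"
    final: str = "DONE"
    trace: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, question: str, max_steps: int = 10) -> Memory:
        memory: Memory = {"question": question, "evidence": []}
        self.trace = []  # one (state, output, branch) record per module call
        state = self.start
        for _ in range(max_steps):
            if state == self.final:
                break
            output, branch = self.modules[state](memory)
            self.trace.append((state, output, branch))
            state = self.transitions[(state, branch)]
        return memory


agent = FSMAgent(
    modules={"retrieve": retrieve, "answer": answer, "give_up": give_up},
    transitions={
        ("retrieve", "found"): "answer",
        ("retrieve", "missing"): "give_up",
        ("answer", "done"): "DONE",
        ("give_up", "done"): "DONE",
    },
)
result = agent.run("Capital of France?")
print(result["answer"], [state for state, _, _ in agent.trace])
```

Because every module call is recorded as a separate entry in `trace`, a reviewer could inspect and judge each step individually, which is the property that makes feedback on individual modules possible in a design like this.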
The training of AMOR consists of two stages: warm-up and adaptation. The warm-up stage involves fine-tuning the LLMs using examples constructed from various public datasets, enabling AMOR to generalize across different knowledge environments. The adaptation stage tailors AMOR to specific domains by collecting process feedback during autonomous execution and further fine-tuning based on this feedback.
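One way to picture the adaptation stage is as a data-collection step that turns reviewed execution traces into per-module fine-tuning examples. The sketch below assumes a feedback format the summary does not specify: each step carries an optional human `correction`, and an accepted step keeps the module's own output as its target. The `Step` class and `build_examples` function are hypothetical names, and the actual fine-tuning call is left out.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Step:
    module: str                       # which module produced this step
    prompt: str                       # input the module saw
    output: str                       # what the module generated
    correction: Optional[str] = None  # hypothetical: human fix, None if accepted


def build_examples(steps: List[Step]) -> Dict[str, List[Tuple[str, str]]]:
    """Group (prompt, target) pairs by module for module-specific fine-tuning."""
    examples: Dict[str, List[Tuple[str, str]]] = defaultdict(list)
    for step in steps:
        # Use the human correction when present, else the accepted output.
        target = step.output if step.correction is None else step.correction
        examples[step.module].append((step.prompt, target))
    return dict(examples)


steps = [
    Step("retrieve", "Capital of France?", "doc: Lyon", correction="doc: Paris"),
    Step("answer", "evidence: doc: Paris", "Paris"),
]
examples = build_examples(steps)
print(examples)
```

Grouping by module means a single corrected step yields a training signal for exactly the module that made the mistake, without requiring the final answer to be labeled.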
Experiments across multiple domains demonstrate that AMOR outperforms strong baselines, thanks to its FSM-based reasoning and process feedback mechanism. The paper also discusses the effectiveness of process feedback in adaptation compared to outcome feedback, highlighting its superior data efficiency and adaptability. Future work will explore extending AMOR to handle more types of knowledge and broader agent tasks.