Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk


10 Jan 2024 | Dennis Ulmer, Elman Mansimov, Kaixiang Lin, Justin Sun, Xibin Gao, Yi Zhang
This paper proposes a method for bootstrapping task-oriented dialogue agents via self-talk, in which large language models (LLMs) converse with each other in different roles to generate their own training data. Two LLMs, a client and an agent, interact in a structured dialogue that simulates a real-world task: the agent follows a prescribed workflow, while the client supplies context and motivation. The generated dialogues are filtered for quality and then used for supervised fine-tuning to improve the agent's performance. An automated metric is introduced to evaluate dialogue success and consistency, and human evaluations are conducted to validate the effectiveness of the self-talk data.

The method is evaluated on character descriptions from the LIGHT dataset, with dialogues generated by a 13 billion parameter OpenLlama variant. The self-talk loop pairs a 30 billion parameter MosaicAI chat model as the client with a 7 billion parameter model as the agent. Dialogues are filtered by subgoal completion and workflow adherence; filters that keep dialogues with at least five completed subgoals, or the top 5% of dialogues by subgoal completion, yield the best performance. Human evaluations confirm that the best filters significantly improve dialogue success, workflow adherence, and overall dialogue quality.

The study demonstrates that self-talk can generate high-quality training data for task-oriented dialogue agents, and that such data makes supervised fine-tuning markedly more effective. The results highlight the importance of filtering and the potential of self-talk for improving dialogue agents. The study also identifies challenges, such as models producing low-quality dialogues or getting stuck in conversational loops, and suggests future work to address them. Overall, the method offers a promising approach for training task-oriented dialogue agents with self-generated data.
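To make the self-talk loop concrete, the following is a minimal sketch of the two-model setup. It assumes a generic `chat(system_prompt, transcript)` helper wrapping whichever LLM backend plays each role; the prompt contents, the `[END]` termination marker, and the turn cap are illustrative assumptions, not the authors' exact implementation.

```python
from dataclasses import dataclass

def chat(system_prompt: str, transcript: list[dict]) -> str:
    """Hypothetical chat-completion wrapper; plug in any LLM backend here."""
    raise NotImplementedError

@dataclass
class SelfTalkLoop:
    """Two LLMs role-play a client and an agent to generate a training dialogue."""
    client_prompt: str   # persona, context, and motivation for the client
    agent_prompt: str    # task description plus the workflow the agent must follow
    max_turns: int = 16  # assumed cap, guarding against unbounded or looping dialogues

    def run(self) -> list[dict]:
        transcript: list[dict] = []
        for _ in range(self.max_turns):
            # The client speaks first, grounded in its persona and goal.
            transcript.append({"role": "client",
                               "text": chat(self.client_prompt, transcript)})
            # The agent responds, attempting to follow its prescribed workflow.
            agent_reply = chat(self.agent_prompt, transcript)
            transcript.append({"role": "agent", "text": agent_reply})
            if "[END]" in agent_reply:  # illustrative termination signal
                break
        return transcript
```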
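The automated metric scores each dialogue by how many workflow subgoals the agent actually completed. The paper's exact implementation is not reproduced here; one simple stand-in is to walk the workflow in order and count steps that some agent utterance plausibly realizes, using a string-similarity threshold (an embedding-based similarity would be a natural upgrade).

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Cheap lexical similarity stand-in; a real system might compare embeddings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def count_completed_subgoals(transcript: list[dict], workflow: list[str],
                             threshold: float = 0.6) -> int:
    """Count workflow steps realized by agent utterances, in workflow order:
    a step only counts once the previous step has been matched."""
    step_idx = 0
    for turn in transcript:
        if step_idx >= len(workflow):
            break
        if (turn["role"] == "agent"
                and similarity(turn["text"], workflow[step_idx]) >= threshold):
            step_idx += 1
    return step_idx
```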
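With subgoal counts attached, the two best-performing filters reported above (at least five completed subgoals, or the top 5% by completion) reduce to a few lines, and each surviving dialogue can be unrolled into supervised fine-tuning pairs. The dictionary layout and helper names below are assumptions for illustration.

```python
import math

def filter_min_subgoals(dialogues: list[dict], min_subgoals: int = 5) -> list[dict]:
    """Absolute filter: keep dialogues with at least `min_subgoals` completed."""
    return [d for d in dialogues if d["completed_subgoals"] >= min_subgoals]

def filter_top_percent(dialogues: list[dict], percent: float = 5.0) -> list[dict]:
    """Relative filter: keep the top `percent`% of dialogues by subgoal completion."""
    ranked = sorted(dialogues, key=lambda d: d["completed_subgoals"], reverse=True)
    keep = max(1, math.ceil(len(ranked) * percent / 100.0))
    return ranked[:keep]

def to_sft_pairs(dialogue: dict) -> list[dict]:
    """Unroll one filtered dialogue into (context, target) pairs, where every
    agent turn becomes a fine-tuning target conditioned on the prior transcript."""
    pairs, context = [], []
    for turn in dialogue["transcript"]:
        if turn["role"] == "agent":
            pairs.append({"context": list(context), "target": turn["text"]})
        context.append(turn)
    return pairs
```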