This paper presents a method to reduce hallucination in structured outputs using Retrieval-Augmented Generation (RAG) in the context of generating workflows from natural language requirements. The authors describe a system that leverages RAG to improve the quality of structured outputs, particularly in enterprise applications where workflows are represented as JSON documents. The system significantly reduces hallucination and allows the LLM to generalize to out-of-domain settings. Additionally, the system demonstrates that a small, well-trained retriever permits a smaller LLM to be used without loss in performance, making deployments less resource-intensive.
The paper discusses the challenges of using Large Language Models (LLMs) for structured output tasks, such as converting natural language to code or SQL, and how RAG can help mitigate hallucination by incorporating external knowledge. The authors show that using RAG in workflow generation reduces hallucination and improves results. They also demonstrate that a small retriever can be used with a smaller LLM without loss in performance.
The methodology involves training a retriever encoder to align natural language with JSON objects and training an LLM in a RAG fashion by including the retriever's output in its prompt. The retriever is trained with a contrastive loss that minimizes the distance between positive pairs while maximizing the distance between negative pairs. The system is evaluated on both in-domain and out-of-domain datasets, showing that RAG significantly reduces hallucination and improves performance.
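The retriever training and prompt construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes an InfoNCE-style contrastive objective with in-batch negatives over precomputed, L2-normalized embeddings, and the function names (`info_nce_loss`, `build_rag_prompt`) are hypothetical.

```python
import numpy as np

def info_nce_loss(query_emb: np.ndarray, step_emb: np.ndarray,
                  temperature: float = 0.07) -> float:
    """Contrastive loss over a batch of (natural-language, JSON-step) pairs.

    query_emb, step_emb: (batch, dim) L2-normalized embeddings, where row i
    of each matrix forms a positive pair; every other row in the batch acts
    as a negative for that query.
    """
    # Cosine similarities between every query and every step in the batch.
    logits = query_emb @ step_emb.T / temperature          # (batch, batch)
    # Softmax cross-entropy with the diagonal as the positive class:
    # pull matched pairs together, push mismatched pairs apart.
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def build_rag_prompt(requirement: str, retrieved_steps: list[str]) -> str:
    """Include the retriever's top-k step names in the LLM prompt."""
    context = "\n".join(f"- {s}" for s in retrieved_steps)
    return (f"Available steps:\n{context}\n\n"
            f"Requirement: {requirement}\nWorkflow JSON:")
```

In this setup the loss is lowest when each query embedding is closest to its own step embedding, which is the alignment property the retriever needs before its output is injected into the generation prompt.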
The results show that using a retriever reduces hallucination and improves the quality of the generated JSON output, with the 7B parameter RAG model providing the best trade-off between performance and model size. The system is evaluated on several metrics, including Trigger Exact Match, Bag of Steps, and Hallucinated Tables and Steps, and RAG significantly reduces hallucination compared to using an LLM alone.
The paper also discusses the impact of RAG on engineering, showing that the system can be deployed with a smaller model and that the retriever can be reused for other use cases. The authors conclude that RAG is an effective approach for reducing hallucination in structured output tasks while enabling generalization to out-of-domain settings. They also note that while their approach reduces hallucination, it does not eliminate the risk of harm due to hallucination, and their system includes a post-processing layer to indicate generated steps that do not exist.
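The post-processing layer mentioned above could look something like the following sketch. The actual implementation is not described in detail in the summary; here each generated step is simply annotated with a hypothetical `exists` flag so that downstream consumers can surface steps the model invented.

```python
def flag_unknown_steps(workflow: dict, catalog: set[str]) -> dict:
    """Annotate each generated step with whether it exists in the catalog.

    Mutates and returns the workflow so callers can render a warning for
    any step where "exists" is False instead of silently executing it.
    """
    for step in workflow.get("steps", []):
        step["exists"] = step.get("name") in catalog
    return workflow
```

Flagging rather than silently dropping unknown steps keeps a human in the loop, which matches the authors' caveat that hallucination is reduced but not eliminated.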