Increasing the LLM Accuracy for Question Answering: Ontologies to the Rescue!

May 21, 2024 | Dean Allemang, Juan F. Sequeda
This technical report presents an approach to improve the accuracy of Large Language Model (LLM)-powered question-answering systems by leveraging ontologies and incorporating LLM-based query repair. The study shows that using a knowledge graph representation of an enterprise SQL database significantly increases accuracy compared to direct SQL-based question answering. Previous research demonstrated that using a knowledge graph improved accuracy from 16% to 54%. The current study further raises this accuracy to 72% by introducing two components: Ontology-based Query Check (OBQC) and LLM Repair.

OBQC detects errors in LLM-generated SPARQL queries by comparing them against the semantics of the ontology. It checks both the body (the WHERE clause) and the head (the SELECT clause) of the query. OBQC is deterministic and does not use an LLM, relying solely on the ontology's logical structure. If an error is detected, the query is either not returned or passed to the LLM Repair component.

LLM Repair uses the error explanations generated by OBQC to prompt the LLM to repair the query. This process is repeated until the query passes the OBQC check or a maximum number of iterations is reached (a minimal sketch of this check-and-repair loop appears below).

The results show that this approach significantly improves accuracy, with an overall error rate of 20%, compared to 83.3% without using a knowledge graph. The study also identifies common errors in LLM-generated SPARQL queries, such as incorrect property direction or domain/range mismatches. Domain-related checks account for 70% of OBQC's repairs, while SELECT clause checks account for the remaining 30%.
These results demonstrate that ontologies can materially improve the accuracy of LLM-powered question answering, supporting the conclusion that investing in metadata, semantics, ontologies, and knowledge graphs is essential for building accurate systems of this kind.