Knowledge-to-SQL: Enhancing SQL Generation with Data Expert LLM

6 Jun 2024 | Zijin Hong, Zheng Yuan, Hao Chen, Qinggang Zhang, Feiran Huang
The paper introduces Knowledge-to-SQL, a framework that enhances SQL generation by supplying expert knowledge through a Data Expert Large Language Model (DELLM). The framework addresses the challenge of generating accurate SQL queries for user questions by leveraging tailored knowledge produced by a dedicated data-expert model. DELLM combines a table-reading module with knowledge-oriented supervised fine-tuning, and a Preference Learning via Database Feedback (PLDBF) strategy further refines the generated knowledge for LLM-based text-to-SQL.

The framework consists of three modules: a supervised fine-tuned model that generates expert knowledge, a preference-learning stage that refines this model using database execution feedback, and an off-the-shelf text-to-SQL model that consumes the generated knowledge to produce SQL queries. DELLM is trained with a combination of supervised fine-tuning and preference learning: during supervised fine-tuning it learns to generate expert knowledge conditioned on the user question and the database schema, and during preference learning it is refined by aligning database execution feedback with the contribution of the ground-truth SQL.

The framework is evaluated on the BIRD and Spider datasets, where the knowledge generated by DELLM improves the execution accuracy and valid efficiency score of existing text-to-SQL models, with notable gains on both benchmarks. The approach also proves effective in cross-domain scenarios and remains robust when only part of the training data is used.
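To make the pipeline concrete, here is a minimal sketch of the inference flow, assuming a generic `generate()` interface for both the DELLM and the downstream text-to-SQL model. The prompt wording, class, and function names are illustrative assumptions, not the paper's released code.

```python
# Minimal sketch of the Knowledge-to-SQL inference flow described above.
# Model interfaces, prompt templates, and helper names are assumptions
# made for illustration; they are not the authors' published interface.

from dataclasses import dataclass


@dataclass
class Example:
    question: str  # natural-language user question
    schema: str    # serialized database schema (tables, columns, types)


def generate_expert_knowledge(dellm, example: Example) -> str:
    """Ask the fine-tuned Data Expert LLM (DELLM) for tailored knowledge."""
    prompt = (
        "Given the database schema and the question, write the expert "
        "knowledge needed to answer it.\n"
        f"Schema:\n{example.schema}\n"
        f"Question: {example.question}\nKnowledge:"
    )
    return dellm.generate(prompt)  # hypothetical .generate() API


def generate_sql(text2sql_llm, example: Example, knowledge: str) -> str:
    """Feed question, schema, and DELLM knowledge to an off-the-shelf text-to-SQL model."""
    prompt = (
        f"Schema:\n{example.schema}\n"
        f"Question: {example.question}\n"
        f"Expert knowledge: {knowledge}\n"
        "SQL:"
    )
    return text2sql_llm.generate(prompt)  # hypothetical .generate() API


def knowledge_to_sql(dellm, text2sql_llm, example: Example) -> str:
    """Full pipeline: knowledge generation followed by SQL generation."""
    knowledge = generate_expert_knowledge(dellm, example)
    return generate_sql(text2sql_llm, example, knowledge)
```

Because the text-to-SQL model is treated as a black box here, the same DELLM-generated knowledge can in principle be plugged into different existing text-to-SQL systems, which is how the framework is able to boost several off-the-shelf models.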
The study highlights the importance of expert knowledge in improving the accuracy and efficiency of text-to-SQL models. The proposed framework provides a new approach to generating expert knowledge for text-to-SQL tasks, which can be used to enhance the performance of large language models in this domain.
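Finally, as a rough illustration of how the database-execution feedback behind PLDBF could be turned into preference data, the sketch below samples several knowledge candidates, generates SQL with each (reusing the `Example` class and generation helpers from the sketch above), and labels a candidate as preferred when its SQL reproduces the execution result of the ground-truth query. The sampling loop, the pairwise (chosen, rejected) format, and the set-based result comparison are assumptions for illustration, not the authors' exact procedure.

```python
# Rough illustration of converting database-execution feedback into
# preference pairs for DELLM, in the spirit of the PLDBF strategy
# summarized above. Helper names and the pairwise format are assumptions.

import sqlite3
from itertools import product


def execute(db_path: str, sql: str):
    """Run a query and return its result set, or None if execution fails."""
    try:
        with sqlite3.connect(db_path) as conn:
            return frozenset(map(tuple, conn.execute(sql).fetchall()))
    except sqlite3.Error:
        return None


def build_preference_pairs(dellm, text2sql_llm, example, gold_sql, db_path,
                           num_samples: int = 4):
    """Label sampled knowledge as chosen/rejected by whether the SQL it
    induces matches the execution result of the ground-truth SQL."""
    gold_result = execute(db_path, gold_sql)
    good, bad = [], []
    for _ in range(num_samples):
        knowledge = generate_expert_knowledge(dellm, example)   # from the sketch above
        pred_sql = generate_sql(text2sql_llm, example, knowledge)
        pred_result = execute(db_path, pred_sql)
        (good if pred_result is not None and pred_result == gold_result
         else bad).append(knowledge)
    # Every (helpful, unhelpful) combination becomes one preference pair.
    return [{"prompt": example.question, "chosen": g, "rejected": b}
            for g, b in product(good, bad)]
```

In practice, such (prompt, chosen, rejected) triples could be fed to a standard preference-optimization trainer; the exact objective and feedback alignment used in the paper are not reproduced here.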