6 Mar 2024 | Bin Zhang, Yuxiao Ye, Guoqing Du, Xiaoru Hu, Zhishuai Li, Sun Yang, Chi Harold Liu, Rui Zhao, Ziyue Li, Hangyu Mao
This paper addresses the challenges in benchmarking Large Language Models (LLMs) for the Text-to-SQL task, which involves transforming natural language questions into structured SQL statements. The authors construct a new dataset to mitigate overfitting risks and formulate five evaluation tasks (Text-to-SQL, SQL Debugging, SQL Optimization, Schema Linking, and SQL-to-Text) to comprehensively assess LLMs' performance. They identify optimal prompt templates and in-context learning strategies for each task, highlighting performance disparities among LLMs. The study provides valuable insights for developing more effective LLM-based Text-to-SQL systems, emphasizing the importance of careful model selection and prompt engineering. Key findings include the effectiveness of specific prompt templates, the superior performance of coding-specific models, and the need for detailed error information in self-debugging.
The research also explores the potential of LLMs in SQL optimization and schema linking, contributing to the advancement of Text-to-SQL systems.