OPENTAB: ADVANCING LARGE LANGUAGE MODELS AS OPEN-DOMAIN TABLE REASONERS

2024 | Kezhi Kong, Jian Zhang, Zhengyuan Shen, Balasubramaniam Srinivasan, Chuan Lei, Christos Faloutsos, Huzefa Rangwala, George Karypis
OPENTAB is an open-domain table reasoning framework that uses large language models (LLMs) to answer questions over structured table data. The framework has three main components: a RETRIEVER, a CODER, and a READER. The RETRIEVER uses BM25 to fetch relevant tables from a corpus, the CODER generates SQL programs to parse the retrieved tables, and the READER uses an LLM to formulate the final response from the SQL execution results. To address the open-domain setting, where the relevant table is not given in advance, OPENTAB applies a Generative Reranking & Sequential Reasoning (GRSR) strategy that prioritizes tables whose generated SQL programs are more similar to the natural language query. This reduces the impact of hallucinations and makes it more likely that the correct tables are selected for reasoning. Experimental results show that OPENTAB significantly outperforms baselines in both open- and closed-domain settings, achieving up to 21.5% higher accuracy. The framework is described as scalable, robust, and effective on large-scale tabular data. The system is implemented as an open-source project, making it accessible for further research and application.
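The retrieve, generate, rerank, and execute flow described above can be illustrated with a minimal sketch. It is not the OPENTAB implementation: the function names (`bm25_rank`, `jaccard`, `answer`, `toy_coder`), the toy tables, the BM25 parameters, and the token-overlap similarity used for reranking are all assumptions made for illustration. The CODER is replaced by a hard-coded stand-in rather than an LLM, and the READER and the sequential-reasoning part of GRSR are omitted, so the sketch simply returns the raw SQL result.

```python
import math
import re
import sqlite3
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query, docs, k1=1.5, b=0.75):
    """Return document indices sorted by descending Okapi BM25 score."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return sorted(range(n_docs), key=lambda i: scores[i], reverse=True)


def jaccard(a, b):
    """Token-overlap similarity; an assumed stand-in for query/SQL similarity."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def answer(question, tables, generate_sql, top_k=2):
    """tables maps name -> (columns, rows); generate_sql is a hypothetical CODER."""
    names = list(tables)
    # RETRIEVER: serialize each table to text and rank the corpus with BM25.
    docs = [
        " ".join([n, *tables[n][0], *(str(c) for row in tables[n][1] for c in row)])
        for n in names
    ]
    candidates = [names[i] for i in bm25_rank(question, docs)[:top_k]]
    # CODER: one SQL program per retrieved table.
    programs = {n: generate_sql(question, n, tables[n][0]) for n in candidates}
    # Reranking: prefer the table whose SQL is most similar to the question.
    best = max(candidates, key=lambda n: jaccard(question, programs[n]))
    # Execute the chosen program against an in-memory SQLite copy of the table.
    columns, rows = tables[best]
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE {best} ({', '.join(columns)})")
    placeholders = ", ".join("?" * len(columns))
    conn.executemany(f"INSERT INTO {best} VALUES ({placeholders})", rows)
    result = conn.execute(programs[best]).fetchall()
    conn.close()
    return best, result


# Toy corpus (illustrative data only).
tables = {
    "cities": (["city", "population"], [("Paris", 2100000), ("Lyon", 520000)]),
    "rivers": (["river", "length_km"], [("Seine", 777), ("Loire", 1006)]),
}


def toy_coder(question, table, columns):
    # Hypothetical stand-in for the LLM CODER: always filters on 'Paris'.
    return f"SELECT {columns[1]} FROM {table} WHERE {columns[0]} = 'Paris'"


best, result = answer("What is the population of Paris?", tables, toy_coder)
print(best, result)  # cities [(2100000,)]
```

In this toy run both tables are retrieved, but the SQL generated for `cities` shares more tokens with the question than the SQL generated for `rivers`, so the reranking step selects `cities` before execution. A real system would substitute an LLM call for `toy_coder` and pass the execution result to a READER model.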