Large Language Model for Table Processing: A Survey

2024 | Weizheng Lu, Jing Zhang, Ju Fan, Zihao Fu, Yueguo Chen, Xiaoyong Du
This survey provides a comprehensive overview of table-related tasks, examining both user scenarios and technical aspects. It covers traditional tasks like table question answering as well as emerging fields such as spreadsheet manipulation and table data analysis. The paper summarizes training techniques for LLMs and VLMs tailored to table processing, discusses prompt engineering, particularly the use of LLM-powered agents, and highlights challenges such as processing implicit user intentions and extracting information from various table sources. Tables are central to everyday activities such as database queries, spreadsheet manipulation, web table question answering, and extracting information from tables in images; automating these tasks with LLMs or VLMs offers significant public benefit.

The paper discusses the unique challenges of table processing, including structured data, complex reasoning, and the need to integrate external tools. It categorizes methods based on the latest paradigms in LLM usage, focusing on instruction-tuning and LLM-powered agent approaches. The paper outlines four types of tables: spreadsheet, web table, database, and document. It also discusses the differences between tables and text, highlighting the two-dimensional structure of tables and their reliance on schemas.

Table tasks include table QA, fact verification, data cleaning, and data analysis. The paper also covers the data lifecycle, including data entry, cleaning, CRUD operations, analysis, and visualization, and discusses table data representation, both textual and visual.

It explores training techniques for LLMs and VLMs, including pre-LLM-era methods, instruction tuning, code tuning, and hybrid approaches. It also discusses prompting strategies for LLMs, including the use of LLM-powered agents, and highlights challenges such as cost, accuracy, and privacy.

Finally, the paper summarizes open-source datasets, benchmarks, and software that can facilitate the community's progress, highlighting recent datasets and benchmarks and emphasizing features such as robustness. It concludes with a discussion of the challenges and future directions in table processing using LLMs and VLMs.
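To make the textual-representation idea concrete, below is a minimal sketch of serializing a small table into Markdown before placing it in an LLM prompt, one common text representation for table tasks. The rows, helper name, and prompt wording are illustrative assumptions, not examples taken from the paper.

```python
# Minimal sketch: serialize a small table into Markdown text for an LLM prompt.
# The rows, helper name, and prompt wording are illustrative assumptions.

rows = [
    {"city": "Berlin", "population_millions": 3.7},
    {"city": "Madrid", "population_millions": 3.3},
]

def to_markdown(records):
    """Render a list of dicts as a Markdown table string."""
    headers = list(records[0].keys())
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for record in records:
        lines.append("| " + " | ".join(str(record[h]) for h in headers) + " |")
    return "\n".join(lines)

prompt = (
    "Answer the question using only the table below.\n\n"
    + to_markdown(rows)
    + "\n\nQuestion: Which city has the larger population?"
)
print(prompt)
```

Markdown is only one serialization choice; HTML, CSV, and JSON renderings of tables are also common in the literature, depending on the model and task.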
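For the instruction-tuning techniques the summary mentions, a single training sample typically pairs a natural-language instruction and a serialized table with the expected output. The JSON field names and label vocabulary below are a hedged illustration of one common layout, not a format prescribed by the survey.

```python
# Hedged illustration of one table instruction-tuning sample (fact verification).
# The field names and label vocabulary are assumptions, not the survey's format.

import json

sample = {
    "instruction": "Decide whether the statement is supported by the table.",
    "input": (
        "| city | population_millions |\n"
        "| --- | --- |\n"
        "| Berlin | 3.7 |\n"
        "| Madrid | 3.3 |\n\n"
        "Statement: Madrid has a larger population than Berlin."
    ),
    "output": "refuted",
}

# Instruction-tuning corpora are often stored as one JSON object per line (JSONL).
print(json.dumps(sample))
```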
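The LLM-powered agent approaches the paper discusses often have the model write executable code (for example, pandas or SQL) over the table rather than answer directly, which is one way external tools are integrated. The sketch below shows that generate-then-execute pattern; `ask_llm` is a hypothetical placeholder for any chat-completion API, and its hard-coded reply stands in for a real model response.

```python
# Hedged sketch of the generate-then-execute pattern used by table agents:
# the LLM writes pandas code over the table, and the agent runs it.
# `ask_llm` is a hypothetical placeholder, not an API from the survey.

import pandas as pd

def ask_llm(prompt: str) -> str:
    # Placeholder: a real agent would call an LLM API here and return its code.
    return "result = df.loc[df['population_millions'].idxmax(), 'city']"

def answer_with_code(df: pd.DataFrame, question: str):
    """Ask the LLM for one line of pandas code over `df`, run it, return `result`."""
    prompt = (
        f"Table columns: {list(df.columns)}\n"
        f"Question: {question}\n"
        "Write one line of pandas code that stores the answer in a variable named `result`."
    )
    code = ask_llm(prompt)
    scope = {"df": df}
    exec(code, {}, scope)  # a production agent would sandbox this execution
    return scope["result"]

df = pd.DataFrame({"city": ["Berlin", "Madrid"], "population_millions": [3.7, 3.3]})
print(answer_with_code(df, "Which city has the larger population?"))  # -> Berlin
```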