StructLM: Towards Building Generalist Models for Structured Knowledge Grounding

24 Apr 2024 | Alex Zhuang, Ge Zhang, Tianyu Zheng, Xinrun Du, Junjie Wang, Weiming Ren, Stephen W. Huang, Jie Fu, Xiang Yue, Wenhu Chen
The paper "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" addresses the limitations of large language models (LLMs) in processing structured data, such as tables, graphs, and databases. The authors construct a comprehensive instruction-tuning dataset with 1.1 million examples and train a series of models, referred to as StructLM, based on Mistral and the CodeLlama model family, ranging from 7B to 34B parameters. StructLM outperforms task-specific models on 16 out of 18 evaluated datasets and achieves state-of-the-art (SoTA) performance on 8 structured knowledge grounding (SKG) tasks. StructLM also demonstrates strong generalization across 6 novel held-out SKG tasks, outperforming TableLlama by an average of 35% and Flan-UL2 20B by an average of 10%. The study finds that scaling model size offers marginal benefits, suggesting that structured knowledge grounding remains a challenging task requiring innovative design. The authors release the model weights and training dataset to the community, along with relevant code.The paper "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" addresses the limitations of large language models (LLMs) in processing structured data, such as tables, graphs, and databases. The authors construct a comprehensive instruction-tuning dataset with 1.1 million examples and train a series of models, referred to as StructLM, based on Mistral and the CodeLlama model family, ranging from 7B to 34B parameters. StructLM outperforms task-specific models on 16 out of 18 evaluated datasets and achieves state-of-the-art (SoTA) performance on 8 structured knowledge grounding (SKG) tasks. StructLM also demonstrates strong generalization across 6 novel held-out SKG tasks, outperforming TableLlama by an average of 35% and Flan-UL2 20B by an average of 10%. The study finds that scaling model size offers marginal benefits, suggesting that structured knowledge grounding remains a challenging task requiring innovative design. The authors release the model weights and training dataset to the community, along with relevant code.