14 May 2024 | Mengsong Wu, Tong Zhu, Han Han, Chuanyuan Tan, Xiang Zhang, Wenliang Chen
This paper introduces Seal-Tools, a self-instruct tool learning dataset designed for agent tuning and detailed benchmarking. Seal-Tools contains a large number of API-like tools, together with instances that demonstrate their practical application. To generate data at scale while ensuring reliability, the authors propose a self-instruct method that gives precise control over the generation process. The dataset includes hard instances that call multiple tools, some of which involve nested tool callings. For precise and comprehensive evaluation, the authors use strict format control and design three metrics covering different dimensions. Seal-Tools can therefore serve as a new benchmark for the tool-calling ability of LLMs. The authors evaluate several prevalent LLMs and their own fine-tuned model on Seal-Tools, finding that current systems are far from perfect.
The dataset is constructed with a self-instruct method in which LLMs generate fields, tools, and instances in turn. Fields are first organized into specific domains, tools are then generated for each field, and instances are generated by calling one or several of those tools to resolve a request. The generation pipeline is split into multiple steps, each followed by a checking step that reduces errors caused by LLM hallucination. Tools and instances are described in JSON, and the dataset includes instances with nested tool callings, in which one call consumes the output of another; these are extremely difficult to solve and valuable for fine-tuning.
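To make the structure concrete, here is a minimal sketch, written as Python dictionaries mirroring the JSON format, of what a generated tool and a nested-call instance might look like. The field names (`api_name`, `parameters`, `calls`) and the `API_call_0` placeholder convention are illustrative assumptions, not the exact Seal-Tools schema.

```python
# Illustrative only: field names and the "API_call_0" placeholder are assumptions,
# not the exact schema used by Seal-Tools.

# Two generated API-like tool descriptions.
get_capital = {
    "api_name": "getCapital",
    "description": "Return the capital city of a given country.",
    "parameters": {"country": {"type": "string", "description": "Country name."}},
}

get_weather = {
    "api_name": "getWeather",
    "description": "Return the weather forecast for a given city and date.",
    "parameters": {
        "city": {"type": "string", "description": "City name."},
        "date": {"type": "string", "description": "Date of the forecast."},
    },
}

# An instance whose second call is nested: its "city" parameter is filled with
# the output of the first call, denoted here by the placeholder "API_call_0".
nested_instance = {
    "query": "What will the weather be like tomorrow in the capital of France?",
    "calls": [
        {"api_name": "getCapital", "parameters": {"country": "France"}},
        {"api_name": "getWeather", "parameters": {"city": "API_call_0", "date": "tomorrow"}},
    ],
}
```

Resolving such an instance requires the model to recognize that the second call depends on the output of the first, which is what makes nested callings the hardest cases in the benchmark.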
To make Seal-Tools a comprehensive benchmark, the authors design three evaluation dimensions: Output Format, Tool Selection, and Tool-Parameter Filling-in. Compared with several existing tool learning datasets, Seal-Tools is competitive in scale: it contains a large number of instances, including cross-field and nested callings that test an LLM's ability to think logically. Its evaluation is also more fine-grained than previous benchmarks, scoring tool selection and parameter filling-in separately.
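As a rough illustration of how the Tool Selection dimension could be scored, the sketch below computes precision, recall, and F1 over predicted versus gold tool names. This formulation is an assumption made for illustration; the exact matching rules in Seal-Tools (for example, how parameter values are compared) may differ.

```python
from collections import Counter

def tool_selection_prf(predicted: list[str], gold: list[str]) -> tuple[float, float, float]:
    """Precision, recall, and F1 of predicted tool names against the gold tool list.

    Assumed formulation for illustration; Seal-Tools' exact matching rules may differ.
    """
    overlap = sum((Counter(predicted) & Counter(gold)).values())
    precision = overlap / len(predicted) if predicted else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1

# Example: the model picked getWeather but missed getCapital.
print(tool_selection_prf(["getWeather"], ["getCapital", "getWeather"]))
# precision 1.0, recall 0.5, F1 ~ 0.67
```

A parameter-level score could be defined analogously by matching (tool, parameter, value) triples instead of tool names.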
The authors evaluate several LLMs on Seal-Tools and find that current systems still have considerable room for improvement, especially on nested calling. The fine-tuned model outperforms its base model in both tool calling and parameter filling-in. Breaking the results down by instance type, single-tool instances are generally easier than multiple-tool instances, and nested instances are the most difficult of all, although the fine-tuned model still handles them better than the raw model.
The authors also analyze the error types made by the model, finding that the main errors stem from failing to extract the correct keywords from the query and from misunderstanding the query's requirements. They conclude that Seal-Tools is a high-quality dataset with hard instances that can serve as a new benchmark for evaluating the tool-calling ability of LLMs.