Octopus v2: On-device language model for super agent
16 Apr 2024 | Wei Chen, Zhiyuan Li
The paper introduces Octopus v2, a 2-billion-parameter on-device language model designed to enhance function-calling capabilities, particularly for AI agents. It outperforms GPT-4 in both accuracy and latency while reducing context length by 95%, and it improves latency by 35x compared to Llama-7B with a RAG-based function-calling mechanism. The method tokenizes function names and fine-tunes the model with these "functional tokens," allowing the model to understand software application capabilities and map function descriptions to specific tokens. Because the model then predicts a single token rather than spelling out a full function name, both accuracy and latency improve.
The paper also details the dataset collection process, model training methods, and experimental results, demonstrating the model's effectiveness in various applications, including Android and vehicle functions. Octopus v2 is evaluated against GPT-4, GPT-3.5, and other models, showing superior accuracy and lower latency. The paper concludes with discussions of future work, including the development of a model for on-device reasoning and the potential for broader deployment across cloud and local environments.
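The functional-token idea can be illustrated with a minimal sketch. This is not the authors' code: the token format (`<nexa_k>`) and the API names below are hypothetical, chosen only to show how a single predicted token can stand in for a full function name at decode time.

```python
# Minimal sketch (hypothetical names) of the functional-token idea:
# each callable API is assigned one special token, so the model only
# needs to predict a single token instead of a multi-token function name.

# Hypothetical on-device APIs and their assigned functional tokens.
FUNCTIONAL_TOKENS = {
    "<nexa_0>": "take_a_photo",
    "<nexa_1>": "get_trending_news",
    "<nexa_2>": "set_timer",
}

def decode_call(model_output: str) -> tuple[str, str]:
    """Split a model completion of the form '<nexa_k>(args)' into
    the resolved function name and its argument string."""
    token, _, rest = model_output.partition("(")
    name = FUNCTIONAL_TOKENS[token]  # single-token lookup, no string generation
    return name, rest.rstrip(")")

# Example: the fine-tuned model emits one functional token plus arguments.
name, args = decode_call("<nexa_2>(minutes=5)")
print(name, args)  # -> set_timer minutes=5
```

The key property is that function-name prediction collapses to a one-token classification over the added vocabulary, which is what drives the reported accuracy and latency gains.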