This paper introduces Nano-Capsulator, a framework that compresses long prompts into natural-language (NL) formatted Capsule Prompts while preserving their utility and their transferability across different large language models (LLMs). The central challenges are that NL prompts are incompatible with gradient-based back-propagation and offer little flexibility for imposing length constraints. Nano-Capsulator addresses both by optimizing a semantics-preserving loss that interacts with a reward function featuring length constraints, so the compressed prompt stays faithful to the original while respecting a fixed length budget.
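Since this objective drives the whole compression, a minimal sketch may clarify how the two terms interact. The code below is illustrative only: `semantic_sim`, the word-level length count, and the weight `alpha` are assumptions standing in for the paper's actual loss and reward, not its exact formulation.

```python
def capsule_score(original: str, capsule: str, semantic_sim,
                  max_tokens: int, alpha: float = 1.0) -> float:
    """Score a candidate Capsule Prompt (lower is better).

    Combines a semantics-preserving term with a length-constrained
    penalty. `semantic_sim` is any callable returning a similarity
    in [0, 1]; all names here are illustrative stand-ins for the
    paper's objective, not its exact formulation.
    """
    # Semantics-preserving term: higher similarity -> lower loss.
    sem_loss = 1.0 - semantic_sim(original, capsule)

    # Length-constrained reward term: capsules within budget incur no
    # penalty; overlong ones are penalized per excess token.
    excess = max(len(capsule.split()) - max_tokens, 0)
    return sem_loss + alpha * excess


# Toy usage with a trivial Jaccard similarity as a stand-in scorer:
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(len(sa | sb), 1)

print(capsule_score("a very long original prompt with many details",
                    "a short capsule", jaccard, max_tokens=10))
```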
The framework is evaluated on two prompt types, few-shot chain-of-thought (CoT) demonstrations and passage prompts for reading comprehension, using CommonsenseQA, GSM8K, MultiRC, and TriviaQA-Long. Experimental results show that Capsule Prompts reduce the original prompt length by 81.4%, cut inference latency by up to 4.5×, and save 80.1% of budget overheads, while largely preserving the performance of the original prompts. Utility and transferability are demonstrated across Vicuna-13B, PaLM, and Claude2; latency gains are measured on OPT-2.7B and Vicuna-13B; and API cost savings are reported on PaLM. Because the capsules are plain natural language, they can also be applied directly to similar but unseen downstream datasets without any further training, provided those datasets cover tasks from similar domains. A sketch of this reuse pattern follows below.
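To make the transferability claim concrete, the following hedged sketch shows the intended usage pattern: one Capsule Prompt, produced once by the trained compressor, is sent verbatim to any text-in/text-out backend. The helper names (`llm_call`, the prompt layout, the backend callables) are hypothetical.

```python
def answer_with_capsule(llm_call, capsule_prompt: str, question: str) -> str:
    """Query an LLM backend with a pre-compressed Capsule Prompt.

    `llm_call` abstracts an arbitrary text-in/text-out model API; the
    prompt layout below is an illustrative assumption, not the paper's
    template.
    """
    return llm_call(f"{capsule_prompt}\n\nQuestion: {question}\nAnswer:")


# Hypothetical usage: the same capsule string works across backends
# (e.g., Vicuna-13B, PaLM, Claude2) with no re-training or adaptation.
# for backend in (vicuna_call, palm_call, claude_call):
#     print(answer_with_capsule(backend, capsule, "What is 17 * 24?"))
```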