30 May 2024 | Chaofan Lin, Zhenhua Han, Chengruidong Zhang, Yuqing Yang, Fan Yang, Chen Chen, Lili Qiu
Parrot is an LLM service system designed to enhance the end-to-end performance of LLM-based applications. It introduces the *Semantic Variable*, a unified abstraction that exposes application-level knowledge to public LLM services. Semantic Variables annotate the input/output variables in prompts, creating a data pipeline that connects multiple LLM requests. This allows LLM services to perform conventional data flow analysis, uncovering correlations between requests and enabling new optimization opportunities. Parrot's scheduling policy leverages this application-level knowledge to optimize end-to-end performance, addressing issues such as the excessive overhead of consecutive requests, misaligned scheduling objectives, and redundant computations. Extensive evaluations demonstrate that Parrot can achieve up to an order-of-magnitude improvement in popular and practical use cases of LLM applications.
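
To make the data-flow idea concrete, the following is a minimal sketch of how annotated input/output variables could let a service recover the dependencies between requests. It is an illustration under stated assumptions, not Parrot's actual API: the `{{input:name}}` / `{{output:name}}` placeholder syntax, the `Request` class, and the `build_dependencies` helper are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical placeholder syntax for annotated variables in a prompt.
PLACEHOLDER = re.compile(r"\{\{(input|output):(\w+)\}\}")


@dataclass
class Request:
    """One LLM request whose prompt carries annotated variables (hypothetical)."""
    name: str
    prompt: str

    def variables(self, kind):
        # Return the names of all variables of the given kind ("input" or "output").
        return [var for k, var in PLACEHOLDER.findall(self.prompt) if k == kind]


def build_dependencies(requests):
    """Map each request to the set of requests that produce its inputs."""
    producer = {}
    for req in requests:
        for var in req.variables("output"):
            producer[var] = req.name

    deps = {req.name: set() for req in requests}
    for req in requests:
        for var in req.variables("input"):
            # Variables with no producer (e.g. "task") come from the application.
            if var in producer and producer[var] != req.name:
                deps[req.name].add(producer[var])
    return deps


requests = [
    Request("write_code", "Write Python code for {{input:task}}. Code: {{output:code}}"),
    Request("write_tests", "Write tests for this code: {{input:code}}. Tests: {{output:tests}}"),
    Request("review", "Review {{input:code}} against {{input:tests}}. Review: {{output:review}}"),
]

deps = build_dependencies(requests)
# A valid execution order derived purely from the variable annotations.
order = list(TopologicalSorter(deps).static_order())
print(deps)
print(order)
```

In this sketch the service never needs to see the application code: the shared variable names alone reveal that `write_tests` consumes the output of `write_code`, and that `review` depends on both. A scheduler with this graph could, in principle, chain dependent requests on the service side or spot prompts that share content, which is the kind of opportunity the abstract describes.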