Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains

2024 | Junhong Shen, Neil Tenenholtz, James Brian Hall, David Alvarez-Melis, Nicolò Fusi
This paper introduces TAG-LLM, a framework for repurposing general-purpose Large Language Models (LLMs) to solve specialized tasks in domains underrepresented in their pretraining corpora, such as the physical and biomedical sciences. At its core are *input tags*: continuous vectors appended to the LLM's embedding layer that condition the model. The tags come in two types: *domain tags*, which delimit specialized representations and provide domain-relevant context, and *function tags*, which represent specific functions and compress function-solving instructions. A three-stage training protocol learns these tags from auxiliary data and domain knowledge, and diverse combinations of domain and function tags enable zero-shot generalization to unseen problems. The method is evaluated on ten domains, including eight languages and two specialized scientific domains, demonstrating improved performance over expert models tailored to these tasks. The results show that TAG-LLM can effectively leverage existing LLMs to tackle a wide range of real-world problems with minimal in-domain data and computational cost.
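To make the input-tag idea concrete, below is a minimal PyTorch sketch of how domain and function tags could be realized as small sets of trainable vectors prepended to a frozen LLM's token embeddings. This is not the authors' code: names such as `TagLayer` and `build_tagged_inputs`, the number of tag tokens, and the hidden size are illustrative assumptions.

```python
# Minimal sketch (assumed implementation, not the paper's code) of input tags:
# each tag is a small learned embedding matrix whose vectors are prepended to
# the frozen LLM's token embeddings, conditioning the model on a domain and a
# function.
import torch
import torch.nn as nn


class TagLayer(nn.Module):
    """A learnable tag: a few continuous vectors in the LLM's embedding space."""

    def __init__(self, num_tag_tokens: int, hidden_size: int):
        super().__init__()
        # One trainable vector per tag token; the base LLM itself stays frozen.
        self.tag_embeds = nn.Parameter(torch.randn(num_tag_tokens, hidden_size) * 0.02)

    def forward(self, batch_size: int) -> torch.Tensor:
        # Expand to (batch, num_tag_tokens, hidden) so the tag can be
        # concatenated with the token embeddings of the actual input.
        return self.tag_embeds.unsqueeze(0).expand(batch_size, -1, -1)


def build_tagged_inputs(token_embeds: torch.Tensor,
                        domain_tag: TagLayer,
                        function_tag: TagLayer) -> torch.Tensor:
    """Prepend <domain><function> tag vectors to the input token embeddings."""
    bsz = token_embeds.size(0)
    return torch.cat([domain_tag(bsz), function_tag(bsz), token_embeds], dim=1)


if __name__ == "__main__":
    hidden = 4096  # hidden size of a hypothetical frozen base LLM
    domain_tag = TagLayer(num_tag_tokens=10, hidden_size=hidden)
    function_tag = TagLayer(num_tag_tokens=10, hidden_size=hidden)

    # Stand-in for the frozen LLM's embeddings of a tokenized domain input.
    token_embeds = torch.randn(2, 128, hidden)
    inputs_embeds = build_tagged_inputs(token_embeds, domain_tag, function_tag)
    print(inputs_embeds.shape)  # torch.Size([2, 148, 4096])

    # In practice, `inputs_embeds` would be fed to a frozen causal LM (e.g.
    # model(inputs_embeds=inputs_embeds, ...)) with gradients flowing only into
    # the tag parameters, trained in stages: domain tags first, then
    # single-domain function tags, then cross-domain function tags.
```

Because only the tag vectors are trained, a new task can in principle be addressed by mixing an existing domain tag with an existing function tag, which is the mechanism behind the zero-shot generalization described above.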