15 Feb 2024 | Lingbo Mo, Zeyi Liao, Boyuan Zheng, Yu Su, Chaowei Xiao, Huan Sun
This paper explores the potential safety risks of language agents, which are powered by large language models (LLMs) and have developed rapidly. The authors present a unified conceptual framework for language agents consisting of three major components: Perception, Brain, and Action. Within this framework, they identify 12 potential attack scenarios across the agent components, covering attack strategies such as input manipulation, adversarial demonstrations, jailbreaking, and backdoors. The paper emphasizes the urgent need to understand these risks thoroughly before language agents are widely deployed. It also connects these attack scenarios to strategies that have already succeeded against LLMs, highlighting the need for further research and responsible practices in the development and use of language agents. The authors conclude by calling for increased attention to the safety risks of language agents and the importance of addressing these challenges.