Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems


11 Jan 2024 | Tianyu Cui1*, Yanling Wang1*, Chuanpu Fu2, Yong Xiao1, Sijia Li3, Xinshao Deng2, Yunpeng Liu2, Qinglin Zhang2, Ziyi Qiu2, Peiyang Li2, Zhixing Tan1, Junwu Xiong4, Xinyu Kong4, Zujie Wen4, Ke Xu1,2†, Qi Li1,2†
This paper addresses the safety and security issues of Large Language Models (LLMs) by proposing a comprehensive taxonomy and reviewing existing benchmarks. LLMs, with their advanced capabilities in natural language processing, have become widely used but face significant challenges in terms of safety and security. The authors identify four key modules of an LLM system: input, language model, toolchain, and output. They propose a module-oriented risk taxonomy to systematically analyze potential risks associated with each module and discuss corresponding mitigation strategies. The taxonomy covers 12 specific risks and 44 sub-risk topics, providing a structured approach to understanding and addressing these risks. Additionally, the paper reviews prevalent benchmarks for evaluating the safety and security of LLM systems, aiming to facilitate the development of more responsible and reliable LLMs. The contributions of the paper include a comprehensive survey of risks and mitigation methods, a detailed taxonomy for risk categorization, and a review of benchmarks for evaluation. The authors hope that this work will help LLM participants adopt a systematic perspective to build safer and more beneficial LLM systems.
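
To make the module-oriented structure concrete, below is a minimal Python sketch of how such a taxonomy could be represented and traversed. The four module names (input, language model, toolchain, output) follow the paper; the specific risk and sub-risk labels are illustrative placeholders, not the paper's actual 12 risks and 44 sub-risk topics.

```python
from dataclasses import dataclass, field

# Illustrative sketch of a module-oriented risk taxonomy.
# Module names follow the paper; risk/sub-risk entries are placeholders.

@dataclass
class Risk:
    name: str
    sub_risks: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)

@dataclass
class Module:
    name: str
    risks: list[Risk] = field(default_factory=list)

taxonomy = [
    Module("input", risks=[Risk("adversarial prompts", ["prompt injection", "jailbreaking"])]),
    Module("language model", risks=[Risk("privacy leakage", ["training-data memorization"])]),
    Module("toolchain", risks=[Risk("vulnerable external tools", ["insecure plugins"])]),
    Module("output", risks=[Risk("harmful content", ["toxicity", "misinformation"])]),
]

# Walk the taxonomy module by module and print a risk report.
for module in taxonomy:
    for risk in module.risks:
        print(f"[{module.name}] {risk.name}: {', '.join(risk.sub_risks)}")
```

A structure like this makes the paper's systematic perspective operational: each risk is attributed to the module where it arises, and mitigations and assessment benchmarks can be attached per risk rather than to the LLM system as an undifferentiated whole.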