[slides] garak%3A A Framework for Security Probing Large Language Models

**garak: A Framework for Security Probing Large Language Models** As large language models (LLMs) are increasingly integrated into various applications, the need for scalable evaluation of their security against adversarial attacks grows. However, LLM security is a dynamic field due to the unpredictable nature of LLM outputs, constant updates, and diverse potential adversaries. This paper introduces garak, a framework designed to explore and discover vulnerabilities in LLMs and dialog systems. garak is structured around four components: Generators, Probes, Detectors, and Buffs, which work together to probe an LLM's responses and identify potential issues. The framework is flexible and can be customized for different security evaluation procedures. It aims to provide a systematic approach to exploring and identifying vulnerabilities, contributing to informed discussions on alignment and policy formation for LLM deployment. The paper also discusses the background and related work, including red teaming, vulnerability detection, and testing LLM systems. garak is open-source and designed to facilitate better-informed decisions and more robust artifacts in LLM security evaluation.**garak: A Framework for Security Probing Large Language Models** As large language models (LLMs) are increasingly integrated into various applications, the need for scalable evaluation of their security against adversarial attacks grows. However, LLM security is a dynamic field due to the unpredictable nature of LLM outputs, constant updates, and diverse potential adversaries. This paper introduces garak, a framework designed to explore and discover vulnerabilities in LLMs and dialog systems. garak is structured around four components: Generators, Probes, Detectors, and Buffs, which work together to probe an LLM's responses and identify potential issues. The framework is flexible and can be customized for different security evaluation procedures. It aims to provide a systematic approach to exploring and identifying vulnerabilities, contributing to informed discussions on alignment and policy formation for LLM deployment. The paper also discusses the background and related work, including red teaming, vulnerability detection, and testing LLM systems. garak is open-source and designed to facilitate better-informed decisions and more robust artifacts in LLM security evaluation.

garak: A Framework for Security Probing Large Language Models

16 Jun 2024 | Leon Derczynski, Erick Galinkin, Jeffrey Martin, Subho Majumdar, Nanna Inie