[slides] SPML%3A A DSL for Defending Language Models Against Prompt Attacks

The paper introduces System Prompt Meta Language (SPML), a domain-specific language designed to refine prompts and monitor inputs for LLM-based chatbots. SPML actively checks for malicious prompts, ensuring user inputs align with the chatbot's defined scope to prevent unethical applications and financial losses. It also simplifies chatbot definition crafting with programming language capabilities, overcoming challenges in natural language design. The paper presents a benchmark with 1.8k system prompts and 20k user inputs, offering the first benchmark for evaluating chatbot definition evaluation. Experiments demonstrate SPML's proficiency in understanding attacker prompts, outperforming models like GPT-4, GPT-3.5, and LLaMA. The data and code are publicly available at: https://prompt-compiler.github.io/SPML/. The contributions include a novel language for writing system prompts, a benchmark dataset, and a method to monitor and design secure prompts.The paper introduces System Prompt Meta Language (SPML), a domain-specific language designed to refine prompts and monitor inputs for LLM-based chatbots. SPML actively checks for malicious prompts, ensuring user inputs align with the chatbot's defined scope to prevent unethical applications and financial losses. It also simplifies chatbot definition crafting with programming language capabilities, overcoming challenges in natural language design. The paper presents a benchmark with 1.8k system prompts and 20k user inputs, offering the first benchmark for evaluating chatbot definition evaluation. Experiments demonstrate SPML's proficiency in understanding attacker prompts, outperforming models like GPT-4, GPT-3.5, and LLaMA. The data and code are publicly available at: https://prompt-compiler.github.io/SPML/. The contributions include a novel language for writing system prompts, a benchmark dataset, and a method to monitor and design secure prompts.

SPML: A DSL for Defending Language Models Against Prompt Attacks

19 Feb 2024 | Reshabh K Sharma, Vinayak Gupta, Dan Grossman