04/2024 | Dario Pasquini, Martin Strohmeier, Carmela Troncoso
This paper introduces Neural Exec, a new family of prompt injection attacks. Unlike traditional attacks that rely on manually crafted strings, Neural Exec uses an optimization-based approach to generate execution triggers automatically. These triggers are more effective and flexible than those produced by existing methods, and they can bypass existing detection mechanisms. The paper demonstrates that Neural Exec triggers persist through complex preprocessing pipelines such as Retrieval-Augmented Generation (RAG) and remain effective in indirect prompt injection attacks, which makes them more practical in real-world applications. The authors also show that the method can be used to discover new exploitable patterns in LLM input spaces. Results are reported for multiple open-source LLMs, including Mixtral-8x7B and Llama-3, on which the triggers achieve high attack success rates. The authors conclude that the method gives both attackers and researchers a powerful tool for understanding and exploiting LLM vulnerabilities.