16 Feb 2024 | Richard Fang, Rohan Bindu, Akul Gupta, Qiusi Zhan, Daniel Kang
This paper explores the capabilities of large language models (LLMs) in autonomously hacking websites. The authors demonstrate that LLM agents can perform complex tasks such as blind database schema extraction and SQL injection without prior knowledge of the vulnerability. They find that GPT-4, a frontier model, is capable of these hacks, while existing open-source models are not. The study also shows that GPT-4 can autonomously find vulnerabilities in real-world websites. The authors analyze the cost of these attacks and find it to be significantly lower than the cost of equivalent human effort. The paper raises concerns about the widespread deployment of highly capable LLMs and highlights the need for responsible release policies. The findings suggest that LLMs have offensive cybersecurity capabilities, with significant ethical and legal implications.