What Was Your Prompt? A Remote Keylogging Attack on AI Assistants


14 Mar 2024 | Roy Weiss, Daniel Ayzenshteyn, Guy Amit, Yisroel Mirsky
This paper presents a novel side-channel attack on AI assistants that exploits the token-length side-channel. AI assistants such as ChatGPT and Copilot generate responses as sequences of tokens, transmitting each token as it is generated. Although the responses are encrypted, the length of each token can be inferred from the size of the packets, revealing sensitive information. The authors demonstrate how this side-channel can be used to infer the content of AI assistant responses by leveraging large language models (LLMs) to translate token-length sequences into legible sentences.

The attack involves three key steps (a sketch of step 1 follows below):
1. Extract the token-length sequence from encrypted traffic.
2. Use LLMs to infer the original text from the token-length sequence.
3. Refine the inference with a known-plaintext approach, training the model on example responses from the target AI assistant.

Using this method, the authors accurately reconstruct 29% of AI assistant responses and infer the topic of 55% of them. The attack was tested on OpenAI's ChatGPT-4 and Microsoft's Copilot, in both browser and API traffic, and proved effective at reconstructing responses and inferring their topics despite encryption. The authors highlight the importance of securing AI assistant communications to prevent the exposure of sensitive information, and the paper provides a comprehensive framework for understanding and mitigating the risks associated with the token-length side-channel.
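To make step (1) concrete, here is a minimal Python sketch of recovering a token-length sequence from observed ciphertext record sizes. It is illustrative, not the authors' implementation: it assumes a passive observer who can read TLS record sizes but not plaintext, and that each streamed message either carries exactly one new token plus a fixed framing overhead, or carries the cumulative response so far. The HEADER_OVERHEAD constant and the example sizes are hypothetical and would need per-service calibration.

from typing import List

HEADER_OVERHEAD = 0  # hypothetical; calibrate per service and protocol

def token_lengths_incremental(record_sizes: List[int]) -> List[int]:
    # Case A: each record carries exactly one new token,
    # so token length = record size minus the fixed overhead.
    return [max(size - HEADER_OVERHEAD, 0) for size in record_sizes]

def token_lengths_cumulative(record_sizes: List[int]) -> List[int]:
    # Case B: each record carries the whole response so far,
    # so consecutive size differences give the token lengths.
    if not record_sizes:
        return []
    lengths = [max(record_sizes[0] - HEADER_OVERHEAD, 0)]
    lengths += [b - a for a, b in zip(record_sizes, record_sizes[1:])]
    return lengths

# Hypothetical cumulative record sizes for the streamed reply
# "I am fine." tokenized as "I", " am", " fine", "." (lengths 1, 3, 5, 1):
print(token_lengths_cumulative([1, 4, 9, 10]))  # -> [1, 3, 5, 1]

Step (2) would then feed such a sequence (e.g., "1 3 5 1") to a sequence-to-sequence LLM trained to map token-length patterns back to text, which step (3) refines by fine-tuning on example responses collected from the target assistant.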