Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

The paper introduces Binoculars, a zero-shot detector for text generated by large language models (LLMs). Binoculars contrasts two closely related pre-trained LLMs: it computes the perplexity of a passage under one model and the cross-perplexity between the two models' next-token predictions, then uses the ratio of the two as its score. Machine-generated text tends to receive a lower score than human-written text. The method requires no training data and no model-specific modifications.

According to the paper, Binoculars detects over 90% of samples generated by ChatGPT and other LLMs at a false positive rate of 0.01%, despite not being trained on any ChatGPT data. The detector is evaluated across multiple text sources and scenarios and is compared against existing detectors, which it outperforms in several cases.

The paper also discusses challenges in detecting LLM-generated text, such as the "capybara problem": a prompt can steer an LLM toward output that looks highly surprising, and therefore human-written, to a detector that sees only the output and not the prompt. Binoculars addresses this by normalizing perplexity by cross-perplexity, which serves as a baseline for how surprising one model's predictions are to the other on the same text.

The study highlights the importance of reliable detection methods in combating issues like academic plagiarism, misinformation, and fake reviews. Binoculars is shown to be effective in detecting text across multiple languages and domains, including non-native English writing. The paper also discusses limitations, such as the need for further research on non-conversational text domains and the potential for adversarial attacks. Overall, the paper presents Binoculars as an advance in zero-shot detection of machine-generated text.
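The ratio of perplexity to cross-perplexity described above can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function names, the toy three-token vocabulary, and the hand-written probability tables are all hypothetical, and which of the two models supplies the perplexity term is an assumption made here for illustration. In practice the per-position distributions would come from two real LLMs run on the same tokenized text.

```python
import math

def log_perplexity(probs, token_ids):
    """Average negative log-probability a model assigned to the observed tokens.

    probs[i] is the model's next-token distribution at position i,
    token_ids[i] is the token that actually appeared there.
    """
    return -sum(math.log(p[t]) for p, t in zip(probs, token_ids)) / len(token_ids)

def cross_perplexity(probs_a, probs_b):
    """Average cross-entropy between two models' next-token distributions.

    Measures how surprising model A's predictions are to model B,
    independent of which token actually appeared.
    """
    total = 0.0
    for pa, pb in zip(probs_a, probs_b):
        total += -sum(a * math.log(b) for a, b in zip(pa, pb) if a > 0)
    return total / len(probs_a)

def binoculars_style_score(probs_a, probs_b, token_ids):
    """Ratio of log-perplexity to cross-perplexity (assumed form of the score)."""
    return log_perplexity(probs_a, token_ids) / cross_perplexity(probs_a, probs_b)

# Toy example (hypothetical numbers): vocabulary of 3 tokens, 2 positions.
model_a = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]]
model_b = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1]]

predictable_tokens = [0, 0]   # the tokens both models expected
surprising_tokens = [2, 2]    # tokens both models considered unlikely

score_predictable = binoculars_style_score(model_a, model_b, predictable_tokens)
score_surprising = binoculars_style_score(model_a, model_b, surprising_tokens)
print(score_predictable, score_surprising)
```

In this sketch the denominator depends only on the two models' predictions, so it acts as the baseline against which the observed tokens' surprise is measured. Text made of tokens the models expected gets a lower score than text made of unexpected tokens. Any decision threshold would have to be chosen separately; none is assumed here.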

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

2024 | Abhimanyu Hans, Avi Schwarzschild, Valeria Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein