Scalable watermarking for identifying large language model outputs


24 October 2024 | Sumanth Dathathri, Abigail See, Sumedh Ghaisas, Po-Sen Huang, Rob McAdam, Johannes Welbl, Vandana Bachani, Alex Kaskasoli, Robert Stanforth, Tatiana Matejovicova, Jamie Hayes, Nidhi Vyas, Majd Al Meray, Jonah Brown-Cohen, Rudy Bunel, Borja Balle, Taylan Cemgil, Zahra Ahmed, Kitty Stacpoole, Ilia Shumailov, Ciprian Baetu, Sven Gowa, Demis Hassabis & Pushmeet Kohli
This paper introduces SynthID-Text, a production-ready text watermarking scheme for large language models (LLMs). The method enables high detection accuracy with minimal latency overhead, preserves text quality, and allows efficient watermark detection without access to the underlying LLM. SynthID-Text integrates watermarking with speculative sampling, an efficiency technique used in production systems, to enable watermarking at scale.

Evaluations across multiple LLMs show that SynthID-Text offers better detectability than comparable methods, with no change in LLM capabilities as measured by standard benchmarks and human ratings. A live experiment with nearly 20 million Gemini responses confirmed that text quality is preserved. The scheme can be configured as non-distortionary, preserving text quality, or distortionary, improving detectability at some cost to text quality. SynthID-Text has been deployed to watermark Gemini and Gemini Advanced, demonstrating its feasibility in large-scale production systems.

The paper also discusses the limitations of generative watermarks, including their vulnerability to attacks and the need for coordination between actors running LLM text-generation services. Overall, the work demonstrates the real-world viability of generative text watermarks and sets a practical milestone for accountable, transparent, and responsible LLM deployment.
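To illustrate the core idea of generative watermarking and model-free detection described above, here is a minimal sketch. SynthID-Text itself uses Tournament sampling over keyed pseudorandom g-values; the secret key, the 4-token context window, and the two-candidate tournament below are simplifying assumptions for exposition, not the paper's exact algorithm.

```python
import hashlib
import random

SECRET_KEY = b"example-key"  # assumption: any shared secret; not from the paper


def g_value(key: bytes, context: tuple, token: int) -> int:
    """Keyed pseudorandom bit for a (context, token) pair."""
    h = hashlib.sha256(key + repr((context, token)).encode()).digest()
    return h[0] & 1


def watermarked_sample(probs: dict, context: tuple, key: bytes = SECRET_KEY) -> int:
    """Simplified tournament step: draw two candidate tokens from the
    model's distribution and keep the one with the higher g-value,
    subtly biasing output toward high-g tokens."""
    tokens, weights = zip(*probs.items())
    a, b = random.choices(tokens, weights=weights, k=2)
    return a if g_value(key, context, a) >= g_value(key, context, b) else b


def detection_score(tokens: list, key: bytes = SECRET_KEY) -> float:
    """Mean g-value over the text. Watermarked text scores above 0.5;
    detection needs only the key and the text, not the underlying LLM."""
    total = 0
    for i, tok in enumerate(tokens):
        context = tuple(tokens[max(0, i - 4):i])  # sliding context window
        total += g_value(key, context, tok)
    return total / len(tokens)
```

Because `g_value` is a deterministic keyed function, the detector can recompute it for every token without running the model, which is what makes cheap, model-free detection possible in this family of schemes.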