WaterMax is a novel watermarking scheme for Large Language Models (LLMs) that achieves high detectability without compromising the quality of the generated text. Unlike existing methods, WaterMax does not modify the LLM's weights, logits, temperature, or sampling technique. It trades robustness against computational complexity, which avoids the trade-off between quality and robustness inherent in other watermarking techniques. The scheme is backed by a theoretical analysis and validated experimentally, outperforming state-of-the-art methods on the Mark My Words benchmark suite.
WaterMax works by generating multiple texts for a given prompt and selecting the one with the lowest p-value under the watermark detector, since a low p-value is what the detector reads as evidence of the watermark. Because every candidate is produced by the unmodified LLM, this approach preserves text quality while maintaining detectability. The scheme relies on a theoretical model to characterize watermark performance, including false positive and true positive rates, even under attack.
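The selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the detector or generation procedure of WaterMax itself: the function names `token_bit`, `p_value`, and `select_watermarked` are hypothetical, the toy detector (a keyed pseudo-random bit per pair of adjacent tokens, tested against a fair-coin binomial) is a stand-in chosen for simplicity, and the candidate drafts are hard-coded strings where a real system would sample them from the unmodified LLM.

```python
import hashlib
from math import comb


def token_bit(key: str, prev: str, tok: str) -> int:
    """Hypothetical helper: keyed pseudo-random bit for a (previous token, token) pair."""
    digest = hashlib.sha256(f"{key}|{prev}|{tok}".encode("utf-8")).digest()
    return digest[0] & 1


def p_value(text: str, key: str) -> float:
    """Toy detector (an assumption, not the WaterMax detector).

    Counts how many distinct adjacent-token pairs map to bit 1 and returns
    P(Binomial(n, 1/2) >= k), i.e. how surprising that count is for text
    written without knowledge of the key.
    """
    tokens = text.split()
    pairs = set(zip(tokens, tokens[1:]))  # distinct pairs only, so repeats are not double counted
    n = len(pairs)
    if n == 0:
        return 1.0  # nothing to test: no evidence of a watermark
    k = sum(token_bit(key, a, b) for a, b in pairs)
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


def select_watermarked(drafts: list[str], key: str) -> str:
    """Return the draft with the lowest detector p-value."""
    return min(drafts, key=lambda d: p_value(d, key))


if __name__ == "__main__":
    # Stand-ins for several LLM samples of the same prompt.
    drafts = [
        "the cat sat on the mat",
        "a quick brown fox jumps over the lazy dog",
        "rain fell softly on the quiet town all night",
    ]
    best = select_watermarked(drafts, key="secret")
    print(best, p_value(best, "secret"))
```

The point of the sketch is that the generator is never altered: the watermark comes entirely from which of the ordinary samples is kept.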
The method is robust against various text editing operations, including insertion and paraphrasing. It also addresses the limitations of existing methods, such as token entropy constraints, quality degradation, and text size limitations. WaterMax can theoretically achieve arbitrary watermarking power without quality loss, even for short texts, by minimizing the p-value of a full text for a watermark detector.
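A short worked equation helps show why minimizing the p-value can raise watermarking power without touching quality. It is a simplified model under stated assumptions, not necessarily the analysis used for WaterMax: the symbols n (number of candidate texts), p_i (p-value of candidate i), and α (the detector's false positive rate) are introduced here for illustration, and the p-values of the candidates are assumed independent and uniform on [0, 1], which is an idealization since drafts of the same prompt may be correlated.

```latex
\[
  \Pr\Bigl(\min_{1 \le i \le n} p_i \le \alpha\Bigr)
  \;=\; 1 - \prod_{i=1}^{n} \Pr(p_i > \alpha)
  \;=\; 1 - (1 - \alpha)^{n}.
\]
% Example: \alpha = 0.01 and n = 100 give 1 - 0.99^{100} \approx 0.634.
```

Under these assumptions the detection probability at a fixed false positive rate α grows toward 1 as n increases, so power is paid for with more candidate generations rather than with any change to the text distribution.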
The extra computational cost of the scheme can be limited through parallelization on modern GPUs. It also demonstrates high robustness against attacks, with a detector that is less affected by token insertion or removal. The method is validated through extensive experiments, which show that WaterMax maintains high detectability and quality under various attack scenarios.
WaterMax's design does not depend on the length of the text. Its detector also tolerates modifications to the text. The method is evaluated on the Mark My Words benchmark, where it outperforms other state-of-the-art methods in detectability and robustness. The results show that WaterMax achieves high detectability with minimal quality loss, making it an effective solution for watermarking LLMs.