k-SEMSTAMP: A Clustering-Based Semantic Watermark for Detection of Machine-Generated Text

8 Jun 2024 | Abe Bohan Hou, Jingyu Zhang, Yichen Wang, Daniel Khashabi, Tianxing He
k-SEMSTAMP is a clustering-based semantic watermarking method for detecting machine-generated text. It builds on SEMSTAMP, which partitions the semantic space with locality-sensitive hashing (LSH), and replaces LSH with k-means clustering so that the partition follows the semantic structure of the text. The clusters are fitted on text from a chosen domain, and generation accepts only sentences that fall into valid regions of this partition, which is what the detector later checks for. A cluster margin constraint rejects sentences that lie too close to a cluster boundary, so that a paraphrase is less likely to move a sentence into a different cluster.
Experimental results show that k-SEMSTAMP detects watermarked text more robustly than SEMSTAMP and other baselines under a range of paraphrase attacks, while maintaining generation quality. It is also more efficient to sample from, needing fewer candidate samples before a valid sentence is accepted. One limitation is that the text domain must be specified to initialize the clusters, which may hurt performance when the domain is a poor fit for the text being generated. Despite this limitation, k-SEMSTAMP offers a more effective tool for detecting machine-generated text and contributes to the development of adversarially robust methods for AI governance.
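The partition, the margin constraint, and the accept-or-resample loop described above can be illustrated with a minimal sketch. It is not the authors' implementation, and several things in it are assumptions made for illustration: embeddings are plain NumPy vectors standing in for the output of a sentence encoder, the margin is taken to be the gap between the best and second-best cosine similarity, the valid clusters are chosen by a seeded permutation keyed on the previous sentence's cluster, and every function name and default value (`fit_centroids`, `valid_clusters`, `accept`, `sample_sentence`, `valid_ratio`, `key`) is hypothetical.

```python
import numpy as np


def normalize(x):
    """Scale vectors to unit length so dot products are cosine similarities."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


def fit_centroids(embeddings, k, iters=20, seed=0):
    """Plain Lloyd-style k-means on unit vectors (assumed stand-in for the domain clustering)."""
    rng = np.random.default_rng(seed)
    points = normalize(embeddings)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iters):
        labels = np.argmax(points @ centroids.T, axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = normalize(members.mean(axis=0))
    return centroids


def assign(embedding, centroids):
    """Return (nearest cluster, gap between best and second-best similarity)."""
    sims = centroids @ normalize(embedding)
    order = np.argsort(sims)[::-1]
    return int(order[0]), float(sims[order[0]] - sims[order[1]])


def valid_clusters(prev_cluster, k, valid_ratio=0.25, key=17):
    """Hypothetical rule: a permutation seeded by the previous cluster picks the valid set."""
    rng = np.random.default_rng(key * 1_000_003 + prev_cluster)
    n_valid = max(1, int(round(valid_ratio * k)))
    return set(rng.permutation(k)[:n_valid].tolist())


def accept(embedding, prev_cluster, centroids, margin=0.02, valid_ratio=0.25):
    """Accept only if the cluster is valid and the embedding is not near a boundary."""
    cluster, gap = assign(embedding, centroids)
    in_valid = cluster in valid_clusters(prev_cluster, len(centroids), valid_ratio)
    return in_valid and gap >= margin, cluster


def sample_sentence(candidates, prev_cluster, centroids, **kwargs):
    """Rejection sampling: return (candidate index, cluster) of the first accepted candidate."""
    for i, embedding in enumerate(candidates):
        ok, cluster = accept(embedding, prev_cluster, centroids, **kwargs)
        if ok:
            return i, cluster
    return None  # no candidate passed; a real generator would keep sampling
```

In this sketch a larger `margin` rejects more candidates, which trades sampling cost for embeddings that sit further from cluster boundaries. A detector built on the same assumptions would recompute `assign` and `valid_clusters` for each sentence and count how many sentences land in their valid set.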