RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors

10 Jun 2024 | Liam Dugan, Alyssa Hwang, Filip Trhlik, Josh Magnus Ludan, Andrew Zhu, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch
The paper introduces RAID, a comprehensive benchmark for evaluating machine-generated text detectors. RAID comprises over 6 million generations spanning 11 generative models, 8 domains, 11 adversarial attacks, and 4 decoding strategies. Evaluating 12 detectors (8 open-source and 4 closed-source) on RAID, the authors find that current detectors are easily fooled by adversarial attacks, variations in sampling strategy, repetition penalties, and output from unseen generative models. They release the dataset and a public leaderboard to encourage future research, highlighting the need for more robust, general-purpose detectors as machine-generated content becomes increasingly prevalent.
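To make the evaluation setup concrete, below is a minimal sketch of how one might measure a detector's accuracy across RAID's adversarial attack categories. The file name, column names, and detector stub are illustrative assumptions, not the benchmark's actual schema or API; consult the released dataset and code (https://github.com/liamdugan/raid) for the real interfaces.

```python
# Hypothetical sketch: scoring a detector over a local slice of RAID and
# breaking accuracy down by adversarial attack. Column names ("generation",
# "model", "attack") and the detector interface are assumptions for
# illustration only.
import pandas as pd

def detect(text: str) -> float:
    """Stand-in for a real detector: returns P(machine-generated)."""
    # A real detector (e.g., a fine-tuned classifier or perplexity-based
    # scorer) would replace this constant.
    return 0.5

df = pd.read_csv("raid_sample.csv")  # hypothetical local sample of the data

# Score every generation, then threshold into a binary prediction.
df["score"] = df["generation"].map(detect)
df["pred_machine"] = df["score"] >= 0.5  # threshold chosen for illustration

# Group by attack to see which attacks degrade detector accuracy the most.
accuracy_by_attack = (
    df.assign(correct=df["pred_machine"] == (df["model"] != "human"))
      .groupby("attack")["correct"]
      .mean()
      .sort_values()
)
print(accuracy_by_attack)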