27 Jun 2024 | Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
The paper introduces M4GT-Bench, a new benchmark for detecting machine-generated text (MGT) across multiple languages, domains, and generators. The benchmark comprises three tasks: binary MGT detection, multi-way generator identification, and detection of the boundary in mixed human-machine text. The authors evaluate several MGT detection baselines alongside human performance, finding that strong performance typically requires access to training data from the same domain and generators. The benchmark addresses the challenges posed by the growing prevalence of LLM-generated content, which can be misused for disinformation, academic fraud, and other harms. The paper also discusses the limitations of current detection methods and suggests future research directions, including the development of more robust detectors and the study of adversarial attacks.
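As a rough illustration of the first task, here is a minimal sketch of a common baseline family for binary MGT detection: fine-tuning a pretrained transformer classifier on human vs. machine labels. The model choice, toy data, and hyperparameters below are illustrative assumptions, not the paper's exact experimental setup.

```python
# Sketch of a binary MGT detection baseline (human=0, machine=1).
# Model name, toy examples, and hyperparameters are hypothetical.
from transformers import (AutoModelForSequenceClassification,
                          AutoTokenizer, Trainer, TrainingArguments)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2)

# Toy rows standing in for benchmark training data.
train = Dataset.from_dict({
    "text": ["A paragraph written by a person ...",
             "A paragraph produced by an LLM ..."],
    "label": [0, 1],
})

def tokenize(batch):
    # Truncate to the encoder's maximum context length.
    return tokenizer(batch["text"], truncation=True, max_length=512)

train = train.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mgt-detector",
                           per_device_train_batch_size=8,
                           num_train_epochs=3),
    train_dataset=train,
)
trainer.train()
```

The paper's finding that detectors need in-domain, in-generator training data suggests a classifier like this would degrade noticeably when evaluated on text from an unseen domain or generator.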