Authorship Obfuscation in Multilingual Machine-Generated Text Detection


18 Jun 2024 | Dominik Macko, Robert Moro, Adaku Uchendu, Ivan Srba, Jason Samuel Lucas, Michiharu Yamashita, Nafis Irtiza Tripto, Dongwon Lee, Jakub Simko, Maria Bielikova
The paper "Authorship Obfuscation in Multilingual Machine-Generated Text Detection" by Dominik Macko et al. examines how authorship obfuscation (AO) methods can help machine-generated text (MGT) evade detection. The authors benchmark 10 AO methods against 37 MGT detection methods in 11 languages, for a total of 4,070 combinations (10 × 37 × 11). They find that all tested AO methods can cause evasion of automated detection, with homoglyph attacks being particularly effective. However, some AO methods damage the texts so severely that they are no longer readable, or are easily recognized by humans as altered.
The study also evaluates the robustness of MGT detectors against adversarial perturbations and the effect of data augmentation using obfuscated texts. The results show that simple data augmentation can improve the robustness of detectors against AO techniques. The paper provides a new multilingual dataset of 740k obfuscated texts and discusses the limitations and ethical considerations of the work. The findings highlight the need for more robust MGT detection methods to counter the evolving threats posed by AO techniques.
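To make the homoglyph attack mentioned above concrete, here is a minimal Python sketch of the general idea: swapping Latin letters for visually similar characters from another script, so the text looks unchanged to a reader but differs at the character level. The mapping `HOMOGLYPHS` and the helper names `homoglyph_obfuscate` and `changed_positions` are hypothetical, chosen for illustration; they are not the implementation benchmarked in the paper.

```python
# Hypothetical mapping (an assumption for illustration, not the paper's table):
# a few Latin letters and their look-alike Cyrillic counterparts.
HOMOGLYPHS = {
    "a": "\u0430",  # CYRILLIC SMALL LETTER A
    "e": "\u0435",  # CYRILLIC SMALL LETTER IE
    "o": "\u043e",  # CYRILLIC SMALL LETTER O
    "p": "\u0440",  # CYRILLIC SMALL LETTER ER
    "c": "\u0441",  # CYRILLIC SMALL LETTER ES
}


def homoglyph_obfuscate(text, mapping=HOMOGLYPHS):
    """Replace every mapped character with its look-alike; keep the rest."""
    return "".join(mapping.get(ch, ch) for ch in text)


def changed_positions(original, obfuscated):
    """Return the indices at which the two strings differ."""
    return [i for i, (a, b) in enumerate(zip(original, obfuscated)) if a != b]


sample = "machine generated text"
obfuscated = homoglyph_obfuscate(sample)

# The strings render almost identically but are not equal as code points.
print(obfuscated == sample)
print(changed_positions(sample, obfuscated))
```

Because the substitution is one character for one character, the text length is preserved while the character sequence a detector sees changes, which is one plausible reason such attacks are hard for detectors that were not trained on obfuscated text.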