Authorship Obfuscation in Multilingual Machine-Generated Text Detection

This paper investigates the effectiveness of authorship obfuscation (AO) methods in evading detection of machine-generated text (MGT) in multilingual settings. The authors evaluate 10 AO methods across 37 MGT detection methods in 11 languages, resulting in 4,070 combinations. They find that all tested AO methods can cause evasion of automated detection in all tested languages, with homoglyph attacks being especially successful. However, some AO methods severely damage the text, making it unreadable or easily recognizable by humans. The authors also evaluate the effect of data augmentation on adversarial robustness using obfuscated texts and find that simple data augmentation can improve the robustness of detectors to AO techniques. They also provide a new public dataset of 740k obfuscated texts. The results indicate that existing MGT detection methods are vulnerable to AO methods, particularly in multilingual settings. The authors conclude that adversarial retraining using obfuscated texts can significantly increase the adversarial robustness of detectors, especially against homoglyph and paraphrasing attacks. The study highlights the need for more robust MGT detection methods in multilingual settings.
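To make the homoglyph attack mentioned in the summary concrete, the sketch below swaps a few Latin letters for visually similar Cyrillic code points. It is an illustration only: the summary does not describe the paper's actual implementation, and the `HOMOGLYPHS` table, the `homoglyph_obfuscate` function, and its `rate` and `seed` parameters are all hypothetical names chosen for this example.

```python
import random

# Hypothetical, deliberately tiny Latin -> Cyrillic look-alike table.
# A real attack would likely use a much larger confusables list.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small letter a
    "c": "\u0441",  # Cyrillic small letter es
    "e": "\u0435",  # Cyrillic small letter ie
    "o": "\u043e",  # Cyrillic small letter o
    "p": "\u0440",  # Cyrillic small letter er
    "x": "\u0445",  # Cyrillic small letter ha
}


def homoglyph_obfuscate(text: str, rate: float = 1.0, seed: int = 0) -> str:
    """Replace mapped characters with look-alikes with probability `rate`."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    out = []
    for ch in text:
        if ch in HOMOGLYPHS and rng.random() < rate:
            out.append(HOMOGLYPHS[ch])
        else:
            out.append(ch)
    return "".join(out)


original = "machine generated text"
obfuscated = homoglyph_obfuscate(original)

print(obfuscated)              # looks similar when rendered
print(obfuscated == original)  # but the underlying code points differ
```

The obfuscated string keeps its length and looks much the same to a reader, while a detector sees different code points and therefore different tokens. This is one plausible reason such attacks are effective and also why adding obfuscated samples to the training data could help.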

18 Jun 2024 | Dominik Macko, Robert Moro, Adaku Uchendu, Ivan Srba, Jason Samuel Lucas, Michiharu Yamashita, Nafis Irtiza Tripto, Dongwon Lee, Jakub Simko, Maria Bielikova