2020 | Di Jin, Zhijing Jin, Joey Tianyi Zhou, Peter Szolovits
The paper "Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment" by Di Jin, Zhijing Jin, Joey Tianyi Zhou, and Peter Szolovits presents TEXTFOOLER, a method for generating adversarial text in the black-box setting. TEXTFOOLER is designed to attack natural language processing (NLP) models, focusing on text classification and textual entailment tasks. The authors demonstrate that their method can effectively reduce the accuracy of state-of-the-art models, including BERT, with a limited number of perturbations, while preserving semantic content, grammaticality, and human classification accuracy. The key contributions of TEXTFOOLER are its effectiveness, utility preservation, and efficiency in generating adversarial text. The paper includes a comprehensive evaluation of TEXTFOOLER across multiple datasets and models, showing superior performance compared to previous adversarial attack methods. Additionally, the authors conduct human evaluations to validate the quality of the generated adversarial examples and discuss transferability and adversarial training.
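To make the attack concrete, here is a minimal sketch of a TEXTFOOLER-style greedy black-box attack: rank words by how much deleting each one lowers the model's confidence in the true label, then replace the most important words with synonyms that reduce that confidence until the prediction flips. The callables `predict_proba` (the victim model's black-box scoring function) and `get_synonyms` (a stand-in for the paper's counter-fitted word-embedding lookup) are hypothetical, and the paper's semantic-similarity and part-of-speech filters are omitted for brevity.

```python
# Sketch of a TEXTFOOLER-style greedy black-box attack.
# `predict_proba` and `get_synonyms` are hypothetical stand-ins; the
# paper additionally filters candidates by POS match and Universal
# Sentence Encoder similarity, which this sketch omits.

from typing import Callable, List


def word_importances(words: List[str],
                     predict_proba: Callable[[str], List[float]],
                     label: int) -> List[float]:
    """Score each word by how much deleting it lowers the true-label probability."""
    base = predict_proba(" ".join(words))[label]
    scores = []
    for i in range(len(words)):
        reduced = words[:i] + words[i + 1:]
        scores.append(base - predict_proba(" ".join(reduced))[label])
    return scores


def textfooler_attack(text: str,
                      predict_proba: Callable[[str], List[float]],
                      get_synonyms: Callable[[str], List[str]],
                      label: int) -> str:
    """Greedily substitute synonyms for important words until the label flips."""
    words = text.split()
    scores = word_importances(words, predict_proba, label)
    # Attack words in descending order of importance.
    order = sorted(range(len(words)), key=scores.__getitem__, reverse=True)

    for i in order:
        best_word = words[i]
        best_prob = predict_proba(" ".join(words))[label]
        # Keep the synonym that most reduces the true-label probability.
        for cand in get_synonyms(words[i]):
            trial = words[:i] + [cand] + words[i + 1:]
            prob = predict_proba(" ".join(trial))[label]
            if prob < best_prob:
                best_word, best_prob = cand, prob
        words[i] = best_word
        # Stop as soon as the model's predicted label changes.
        probs = predict_proba(" ".join(words))
        if probs.index(max(probs)) != label:
            break
    return " ".join(words)
```

Because the attack only queries `predict_proba`, it needs no access to gradients or parameters, which is what makes it black-box; the greedy word ordering is what keeps the number of perturbed words small.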