30 May 2024 | Mingli Zhu, Siyuan Liang, Baoyuan Wu
This paper reveals a critical vulnerability in existing backdoor defense strategies: after defense, a backdoor may remain dormant in the model rather than being fully eliminated. The authors introduce the Backdoor Existence Coefficient (BEC) to measure how strongly a backdoor persists in a defended model. They demonstrate that a dormant backdoor can be re-activated at inference time by adding a small perturbation to the original trigger. The study proposes backdoor re-activation attacks for three scenarios: white-box, black-box, and transfer attacks. These methods are validated on both image classification and multimodal contrastive learning tasks. The results show that re-activation attacks can significantly increase the attack success rate on defended models, indicating that current defense mechanisms are insufficient. The paper emphasizes the need for more robust and advanced backdoor defense strategies in the future.
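To make the idea of re-activating a backdoor "by modifying the original trigger with a small perturbation" concrete, here is a minimal sketch of the white-box setting. It is not the authors' algorithm: it swaps in a generic projected signed-gradient search, and it assumes a toy linear softmax classifier in place of a defended network. The names `reactivate_trigger`, `target_loss`, `W`, `eps`, and `lr` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def target_loss(W, x, target):
    # Mean cross-entropy toward the attacker's target class.
    p = softmax(x @ W.T)
    return float(-np.log(p[:, target] + 1e-12).mean())

def reactivate_trigger(W, clean, trigger, target, eps=0.05, steps=50, lr=0.01):
    """Search for a perturbation delta with |delta|_inf <= eps such that
    clean + trigger + delta is pushed toward the target class.

    Assumption: the "defended model" is a toy linear classifier with
    weights W of shape (classes, dims), so the input gradient is analytic.
    Pixel-range clipping is omitted for brevity.
    """
    delta = np.zeros_like(trigger)
    best_delta = delta.copy()
    best_loss = target_loss(W, clean + trigger, target)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(steps):
        p = softmax((clean + trigger + delta) @ W.T)
        # Gradient of the mean cross-entropy with respect to delta.
        grad = ((p - onehot) @ W).mean(axis=0)
        # Signed-gradient step, then projection onto the L-infinity ball.
        delta = np.clip(delta - lr * np.sign(grad), -eps, eps)
        loss = target_loss(W, clean + trigger + delta, target)
        if loss < best_loss:
            # Keep the best perturbation seen so far.
            best_loss, best_delta = loss, delta.copy()
    return best_delta
```

Because the function returns the best perturbation seen during the search, the target-class loss with the perturbed trigger is never higher than with the original trigger. A black-box or transfer variant would need to replace the analytic gradient with query-based estimates or with gradients from a surrogate model, which this sketch does not attempt.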