5 Feb 2024 | Yang Sui, Huy Phan, Jinqi Xiao, Tianfang Zhang, Zijie Tang, Cong Shi, Yan Wang, Yingying Chen, Bo Yuan
This paper explores the detectability of backdoor attacks on diffusion models, a powerful and widely used family of generative AI techniques. The authors systematically analyze the properties of the trigger patterns used in existing backdoor attacks and propose a low-cost trigger detection mechanism based on distribution discrepancy. They then develop a backdoor attack strategy that learns stealthy triggers designed to evade this detection scheme. Empirical evaluations across various diffusion models and datasets demonstrate the effectiveness of both methods: the distribution-based detector flags existing trigger patterns with a 100% detection rate, while the detection-evading attack passes the detector at a rate of nearly 100%. The paper thus advances both the understanding of and the defense against backdoor attacks on diffusion models, improving their security and robustness.
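The summary does not include the authors' implementation, but a distribution-discrepancy check of the general kind described can be illustrated with a short sketch. The code below scores how far a candidate input noise tensor drifts from the standard Gaussian prior that clean diffusion sampling typically starts from, using a Kolmogorov-Smirnov statistic. The function names, the choice of the KS test, the trigger shape, and the threshold are all illustrative assumptions, not the paper's actual method.

```python
import numpy as np
from scipy import stats

def gaussian_discrepancy_score(noise: np.ndarray) -> float:
    """KS statistic between the flattened input noise and N(0, 1),
    the prior a clean diffusion model samples its initial noise from."""
    flat = noise.ravel()
    # The statistic is near 0 when the empirical distribution matches
    # the standard Gaussian and grows as it drifts away from the prior.
    statistic, _ = stats.kstest(flat, "norm")
    return float(statistic)

def looks_triggered(noise: np.ndarray, threshold: float = 0.05) -> bool:
    # threshold is a hypothetical calibration value; in practice it
    # would be set from scores of known-clean Gaussian noise samples.
    return gaussian_discrepancy_score(noise) > threshold

# Example: clean Gaussian noise vs. noise carrying a crude additive
# trigger patch (a stand-in for the visible triggers the paper detects).
rng = np.random.default_rng(0)
clean = rng.standard_normal((3, 32, 32))
triggered = clean.copy()
triggered[:, :16, :16] += 2.0  # shift a 16x16 patch on every channel

print(looks_triggered(clean))      # expected: False
print(looks_triggered(triggered))  # expected: True
```

A stealthy trigger in the spirit of the paper's detection-evading attack would be one whose perturbed noise keeps this kind of discrepancy score below the calibrated threshold while still activating the backdoor.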