Your Diffusion Model is Secretly a Certifiably Robust Classifier

13 Feb 2024 | Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu
This paper introduces Noised Diffusion Classifiers (NDCs), a new family of diffusion classifiers that achieve state-of-the-art certified robustness. The authors generalize diffusion classifiers to classify Gaussian-corrupted data by deriving evidence lower bounds (ELBOs) for these noised distributions, approximating the likelihood with the ELBO, and computing classification probabilities via Bayes' theorem. They then integrate these generalized diffusion classifiers with randomized smoothing to construct smoothed classifiers with non-constant Lipschitzness.

Experimental results demonstrate the superior certified robustness of the proposed NDCs: 80%+ and 70%+ certified robustness on CIFAR-10 under adversarial perturbations with ℓ₂ norms less than 0.25 and 0.5, respectively, using a single off-the-shelf diffusion model and no additional data. The authors propose two variants, the Exact Posterior Noised Diffusion Classifier (EPNDC) and the Approximated Posterior Noised Diffusion Classifier (APNDC), which improve the robustness and scalability of diffusion classifiers. These methods significantly reduce time complexity and scale to large datasets by reusing the same noisy samples across all classes and by employing a search algorithm that excludes as many candidate classes as possible at the outset of classification. The proposed methods outperform previous state-of-the-art approaches in both certified robustness and clean accuracy, demonstrating the effectiveness of the approach in adversarial settings.

The paper also discusses the theoretical foundations of diffusion classifiers, including their Lipschitzness and certified robustness, and highlights how combining diffusion classifiers with randomized smoothing yields tighter certified radii.
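As a rough illustration of the decision rule described above (approximate log p(x | y) for each class y with an ELBO, then apply Bayes' theorem under a uniform prior), here is a minimal sketch. The per-class ELBO values are a toy stand-in for the denoising-loss estimates a real conditional diffusion model would produce, and the function name is hypothetical, not from the paper's code:

```python
import math

def classify_via_bayes(elbo_per_class):
    """Turn per-class log-likelihood lower bounds (ELBOs) into class
    probabilities via Bayes' theorem, assuming a uniform prior p(y).
    This reduces to a softmax over the ELBO values."""
    m = max(elbo_per_class)                      # stabilize the softmax
    exps = [math.exp(e - m) for e in elbo_per_class]
    z = sum(exps)
    return [e / z for e in exps]

# Toy example: class 1 has the highest ELBO, so it receives the most mass.
probs = classify_via_bayes([-3.2, -1.1, -2.7])
print(probs.index(max(probs)))  # → 1
```

In the actual method, each ELBO would itself be estimated by averaging the diffusion model's denoising errors over timesteps and noise samples; the softmax step shown here is the cheap final stage.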
The authors conclude that their findings contribute to a deeper understanding of diffusion classifiers in the context of adversarial robustness and help alleviate concerns regarding their robustness.
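For context on the tighter certified radii mentioned above: randomized smoothing certifies an ℓ₂ radius from the smoothing noise level σ and the estimated probabilities of the top two classes under Gaussian noise. The sketch below computes only the standard baseline bound in the style of Cohen et al., not the paper's improved non-constant-Lipschitz bound, and is purely illustrative:

```python
from statistics import NormalDist

def certified_radius(sigma, p_top, p_runner_up):
    """Standard randomized-smoothing l2 certified radius:
    R = (sigma / 2) * (Phi^-1(p_top) - Phi^-1(p_runner_up)),
    where Phi^-1 is the standard normal inverse CDF."""
    phi_inv = NormalDist().inv_cdf
    return 0.5 * sigma * (phi_inv(p_top) - phi_inv(p_runner_up))

# Example: with sigma = 0.5 and a confident smoothed classifier,
# the certified radius comfortably exceeds 0.25.
r = certified_radius(sigma=0.5, p_top=0.9, p_runner_up=0.05)
print(round(r, 3))  # → 0.732
```

The paper's analysis exploits the non-constant Lipschitzness of the smoothed diffusion classifier to certify larger radii than this baseline formula gives for the same (σ, p_top, p_runner_up).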