13 Feb 2024 | Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, Jun Zhu
This paper addresses the robustness of diffusion models in classification tasks, particularly in the context of adversarial attacks. The authors propose a new family of diffusion classifiers called Noised Diffusion Classifiers (NDCs), which achieve state-of-the-art certified robustness. NDCs are designed to classify Gaussian-corrupted data by deriving evidence lower bounds (ELBOs) for these distributions, approximating the likelihood using the ELBO, and calculating classification probabilities via Bayes' theorem. The integration of NDCs with randomized smoothing results in classifiers with non-constant Lipschitzness, leading to tighter certified robustness radii. Experimental results on the CIFAR-10 dataset demonstrate that NDCs achieve 82.2%, 70.7%, and 54.5% certified robustness at $\ell_2$ radii of 0.25, 0.5, and 0.75, respectively, outperforming previous methods. The approach also shows superior performance on the ImageNet dataset, achieving significant efficiency gains without compromising robustness. The paper provides a comprehensive theoretical analysis and practical improvements, contributing to a deeper understanding of the robustness of diffusion classifiers.
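To make the classification recipe concrete, here is a minimal sketch (not the authors' released code) of the general diffusion-classifier idea the paper builds on: approximate $\log p(x \mid y)$ for each class with a Monte Carlo estimate of the conditional ELBO, then apply Bayes' theorem with a uniform class prior. The `denoiser` argument is a hypothetical conditional noise-prediction network $\epsilon_\theta(x_t, t, y)$, and the linear noise schedule is a standard assumption, not taken from the paper.

```python
import torch

def diffusion_classifier_logits(x, denoiser, num_classes, T=1000, n_samples=64):
    """Approximate log p(y | x) up to a constant for each class y.

    x: a single input tensor, e.g. shape (C, H, W).
    denoiser: hypothetical conditional network eps_theta(x_t, t, y).
    """
    device = x.device
    # Standard linear beta schedule; \bar{alpha}_t is the cumulative product.
    betas = torch.linspace(1e-4, 2e-2, T, device=device)
    alphas_bar = torch.cumprod(1.0 - betas, dim=0)

    elbo = torch.zeros(num_classes, device=device)
    for y in range(num_classes):
        y_tensor = torch.full((n_samples,), y, device=device, dtype=torch.long)
        # Monte Carlo over timesteps and noise draws.
        t = torch.randint(0, T, (n_samples,), device=device)
        noise = torch.randn(n_samples, *x.shape, device=device)
        a_bar = alphas_bar[t].view(-1, *[1] * x.dim())
        # Forward-diffuse x to timestep t: x_t = sqrt(a_bar) x + sqrt(1 - a_bar) eps.
        x_t = a_bar.sqrt() * x.unsqueeze(0) + (1.0 - a_bar).sqrt() * noise
        eps_hat = denoiser(x_t, t, y_tensor)
        # Up to timestep weighting and additive constants, the conditional ELBO
        # is the negative expected denoising error; larger means x is more
        # likely under class y.
        elbo[y] = -((eps_hat - noise) ** 2).flatten(1).sum(dim=1).mean()

    # Bayes' theorem with a uniform prior: p(y | x) is proportional to exp(ELBO_y).
    return torch.log_softmax(elbo, dim=0)
```

The paper's NDC contribution goes further than this sketch: it derives ELBOs for the *Gaussian-corrupted* distributions that randomized smoothing feeds to the base classifier, so the per-class scores above would be computed for noised inputs $x + \delta$, $\delta \sim \mathcal{N}(0, \sigma^2 I)$, and the resulting smoothed classifier's non-constant Lipschitzness is what yields the tighter certified radii reported above.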