Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness


10 Apr 2024 | Sibo Wang, Jie Zhang, Zheng Yuan, Shiguang Shan
This paper proposes Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT), a method for enhancing the zero-shot adversarial robustness of large-scale vision-language models such as CLIP. PMG-AFT leverages the generalization capability of the original pre-trained model by introducing an auxiliary branch that minimizes the distance between the target model's outputs on adversarial examples and the frozen pre-trained model's outputs on those same examples, thereby preserving the generalized features captured during pre-training. An additional regularization loss further enhances adversarial robustness and helps mitigate overfitting during fine-tuning.

Extensive experiments on 15 zero-shot datasets show that PMG-AFT significantly outperforms the state-of-the-art method, improving top-1 robust accuracy by an average of 4.99% and clean accuracy by an average of 8.72%. The results demonstrate that PMG-AFT consistently achieves superior performance in both zero-shot robust accuracy and clean accuracy compared to existing methods, while maintaining the model's generalization ability.

The paper also reviews related work on pre-trained vision-language models, adversarial robustness, and fine-tuning and catastrophic overfitting; describes the components and loss function of PMG-AFT in detail; and concludes that PMG-AFT is an effective method for enhancing the adversarial robustness of large-scale vision-language models.
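The summary above describes the PMG-AFT objective only at a high level. The PyTorch-style sketch below illustrates one plausible reading of it: a standard adversarial fine-tuning loss, plus an auxiliary term that pulls the fine-tuned model's predictions on adversarial images toward those of the frozen pre-trained model, plus a regularization term toward the target model's own clean-image predictions. All names here (pmg_aft_loss, the encoder call signatures, and the weights alpha and beta), as well as the choice of KL divergence as the distance, are illustrative assumptions and not the authors' released code.

```python
import torch
import torch.nn.functional as F

def pmg_aft_loss(target_model, pretrained_model, images_adv, images_clean,
                 text_features, labels, alpha=1.0, beta=1.0):
    """Sketch of a PMG-AFT-style objective (assumed form, not the paper's code).

    target_model:      CLIP image encoder being fine-tuned
    pretrained_model:  frozen copy of the original pre-trained encoder
    text_features:     precomputed, L2-normalized class text embeddings [C, D]
    """
    # Image features from the fine-tuned model on adversarial inputs,
    # scored against the fixed text embeddings (CLIP-style zero-shot head).
    feat_adv = F.normalize(target_model(images_adv), dim=-1)
    logits_adv = 100.0 * feat_adv @ text_features.t()

    # (1) Robust classification loss on adversarial examples.
    loss_robust = F.cross_entropy(logits_adv, labels)

    # (2) Generalization term: keep the target model's adversarial outputs
    # close to the frozen pre-trained model's outputs on the same inputs.
    with torch.no_grad():
        feat_pre = F.normalize(pretrained_model(images_adv), dim=-1)
        logits_pre = 100.0 * feat_pre @ text_features.t()
    loss_general = F.kl_div(F.log_softmax(logits_adv, dim=-1),
                            F.softmax(logits_pre, dim=-1),
                            reduction="batchmean")

    # (3) Regularization: keep adversarial outputs close to the target
    # model's own outputs on the corresponding clean images.
    feat_clean = F.normalize(target_model(images_clean), dim=-1)
    logits_clean = 100.0 * feat_clean @ text_features.t()
    loss_reg = F.kl_div(F.log_softmax(logits_adv, dim=-1),
                        F.softmax(logits_clean.detach(), dim=-1),
                        reduction="batchmean")

    return loss_robust + alpha * loss_general + beta * loss_reg
```

In a training loop, images_adv would be generated by an attack such as PGD against the current target model at each step; the frozen pre-trained model sees the same adversarial images, so the auxiliary term anchors the fine-tuned representation to the pre-trained one rather than to a clean-image target.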