Revisiting Adversarial Training at Scale

21 Apr 2024 | Zeyu Wang*, Xianhang Li*, Hongru Zhu, Cihang Xie
This paper presents AdvXL, an adversarial training framework for efficiently and effectively training robust visual representations at scale. The authors address a long-standing bottleneck in the field: scaling adversarial training to large models and web-scale datasets. AdvXL adopts a two-stage training strategy: a lightweight pre-training phase with reduced input resolution and weaker attacks, followed by a short, intensive fine-tuning phase at full resolution with stronger attacks. This schedule substantially reduces computational cost while maintaining or improving robustness.

Empirically, AdvXL achieves state-of-the-art robust accuracy on ImageNet-1K under AutoAttack. For example, when trained on the DataComp-1B dataset, it surpasses the previous records for $ \ell_{\infty} $-, $ \ell_{2} $-, and $ \ell_{1} $-robust accuracy by margins of 11.4%, 14.2%, and 12.9%, respectively. The framework also generalizes well to unseen attacks: relative to the previous best models trained to be $ \ell_{\infty} $-robust, it improves $ \ell_{2} $- and $ \ell_{1} $-robust accuracy by 14% and 13%, respectively.

The paper further explores using a pre-trained CLIP text encoder to enable training on web-scale datasets with open text descriptions. This lets the model learn from large-scale data without requiring precise class labels, leveraging the semantic information carried by the captions.

Extensive experiments evaluate AdvXL across model sizes, data scales, and training schedules. Scaling both the model and the data significantly improves robustness, with larger models and datasets yielding better performance. The results highlight the importance of scale in adversarial training and show that AdvXL achieves a superior performance-compute trade-off.
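The two-stage schedule described above can be illustrated with a toy adversarial training loop. The sketch below is a minimal, hypothetical NumPy example on a linear classifier with $ \ell_{\infty} $ PGD attacks; it is not the paper's ViT/CLIP pipeline, and all hyperparameters (epsilons, step counts, learning rate) are illustrative, not AdvXL's.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linearly separable data standing in for images.
X = rng.normal(size=(200, 16))
w_true = rng.normal(size=16)
y = (X @ w_true > 0).astype(float)

def loss_and_grads(w, x, y):
    """Logistic loss; returns (loss, grad wrt w, grad wrt x)."""
    z = np.clip(x @ w, -30.0, 30.0)  # clip logits for numerical stability
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    dz = (p - y) / len(y)
    return loss, x.T @ dz, np.outer(dz, w)

def pgd_attack(w, x, y, eps, steps):
    """L-inf PGD: ascend the loss, project back into the eps-ball around x."""
    x_adv = x.copy()
    alpha = 2.0 * eps / steps
    for _ in range(steps):
        _, _, gx = loss_and_grads(w, x_adv, y)
        x_adv = np.clip(x_adv + alpha * np.sign(gx), x - eps, x + eps)
    return x_adv

def train_stage(w, eps, steps, epochs, lr=0.3):
    """Adversarial training: fit the model on PGD-perturbed inputs."""
    for _ in range(epochs):
        x_adv = pgd_attack(w, X, y, eps, steps)
        _, gw, _ = loss_and_grads(w, x_adv, y)
        w = w - lr * gw
    return w

w = np.zeros(16)
# Stage 1: cheap pre-training with a weak attack (small eps, single step).
w = train_stage(w, eps=0.05, steps=1, epochs=300)
# Stage 2: short, intensive fine-tuning with a stronger attack.
w = train_stage(w, eps=0.2, steps=5, epochs=100)

x_adv = pgd_attack(w, X, y, eps=0.2, steps=5)
robust_acc = np.mean(((x_adv @ w) > 0).astype(float) == y)
print(f"robust accuracy under PGD: {robust_acc:.2f}")
```

In AdvXL the analogous savings come from stage 1 operating on smaller images with fewer attack steps, so most of the training budget is spent cheaply and only the short final stage pays the full adversarial cost.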
The paper also compares AdvXL with other state-of-the-art models, showing that it outperforms them in terms of robustness and efficiency. The findings suggest that adversarial training can be effectively scaled to large models and datasets, paving the way for the development of more robust and efficient visual models. AdvXL's approach provides a new direction for adversarial training, demonstrating its potential to advance the field of foundation models.
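The text-supervised objective mentioned above (supervising images with a frozen CLIP text encoder instead of precise labels) reduces to a cross-entropy over cosine similarities between image embeddings and fixed text embeddings. The following is a hypothetical NumPy sketch with random stand-in embeddings; the real pipeline uses a pre-trained CLIP text encoder over web-scale captions, and the temperature value here is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def l2_normalize(a, axis=-1):
    return a / np.linalg.norm(a, axis=axis, keepdims=True)

# Frozen "text embeddings" for K captions (stand-in for a CLIP text encoder).
K, d, n = 5, 32, 8
text_emb = l2_normalize(rng.normal(size=(K, d)))

# Image embeddings from some visual backbone (random here for illustration).
img_emb = l2_normalize(rng.normal(size=(n, d)))
labels = rng.integers(0, K, size=n)  # which caption matches each image

def text_supervised_loss(img_emb, text_emb, labels, tau=0.07):
    """Cross-entropy over cosine similarities to frozen text embeddings."""
    logits = img_emb @ text_emb.T / tau          # (n, K) cosine sims / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

loss = text_supervised_loss(img_emb, text_emb, labels)
print(f"text-supervised loss: {loss:.3f}")
```

Because the text encoder is frozen, the caption embeddings act as an open, ever-extensible set of "class weights," which is what lets training scale to uncurated web data without per-image labels.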