AdaCLIP: Adapting CLIP with Hybrid Learnable Prompts for Zero-Shot Anomaly Detection

22 Jul 2024 | Yunkang Cao, Jiangning Zhang, Luca Frittoli, Yuqi Cheng, Weiming Shen, Giacomo Boracchi
AdaCLIP is a zero-shot anomaly detection (ZSAD) method that leverages a pre-trained vision-language model (VLM) to identify anomalies in novel categories without requiring any training data for those categories. It incorporates learnable hybrid prompts, comprising static and dynamic prompts, into the CLIP model to enhance its adaptability and performance. Static prompts are shared across all images, while dynamic prompts are generated per test image, providing image-specific adaptation. The combination of the two, referred to as hybrid prompts, substantially improves ZSAD performance. Extensive experiments on 14 real-world datasets from the industrial and medical domains show that AdaCLIP outperforms competing ZSAD methods and generalizes better across categories and domains. The study also highlights the importance of diverse auxiliary training data and optimized prompts for stronger generalization.
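The static/dynamic prompt split described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the names (`static_prompts`, `W_dyn`, `hybrid_prompts`), prompt counts, and the use of a single linear projection to generate dynamic prompts from an image embedding are all assumptions for exposition; in AdaCLIP these parameters would be learned on auxiliary data.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM, N_STATIC, N_DYNAMIC = 512, 4, 4  # illustrative sizes, not the paper's

# Static prompts: one set of tokens shared across ALL images
# (random stand-ins here for what would be trained parameters).
static_prompts = rng.normal(scale=0.02, size=(N_STATIC, EMBED_DIM))

# Dynamic-prompt generator: a hypothetical linear projection from the
# image embedding; its weights would also be learned during training.
W_dyn = rng.normal(scale=0.02, size=(EMBED_DIM, N_DYNAMIC * EMBED_DIM))

def hybrid_prompts(image_embedding):
    """Concatenate shared static prompts with image-conditioned dynamic prompts."""
    dynamic = (image_embedding @ W_dyn).reshape(N_DYNAMIC, EMBED_DIM)
    return np.concatenate([static_prompts, dynamic], axis=0)

img_feat = rng.normal(size=(EMBED_DIM,))  # stand-in for a CLIP image feature
prompts = hybrid_prompts(img_feat)
print(prompts.shape)  # (8, 512): 4 static tokens + 4 dynamic tokens
```

The key property is that the first `N_STATIC` rows are identical for every test image, while the remaining rows change with the image feature, which is what gives the hybrid prompts their per-image adaptation capability.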
The code for AdaCLIP is available at <https://github.com/caoyunkang/AdaCLIP>.