Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection

17 Jan 2024 | Yuanpeng Tu, Boshen Zhang, Liang Liu, Yuxi Li, Xuhai Chen, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Cai Rong Zhao
The paper "Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection" addresses the challenge of detecting subtle geometric surface anomalies in industrial settings, where only 2D RGB data is typically available. The authors propose a novel method called Local-to-global Self-supervised Feature Adaptation (LSFA) to enhance the performance of multi-modal anomaly detection. LSFA focuses on fine-tuning pre-trained models (e.g., ImageNet) to better adapt to the specific domain of industrial data, addressing the domain gap issue. The method optimizes both intra-modal and cross-modal features, ensuring high-quality and consistent representations for anomaly detection. Key contributions include: 1. **LSFA Framework**: LSFA adaptively fine-tunes pre-trained models to learn task-oriented representations, improving the detection of subtle anomalies. 2. **Intra-modal Feature Compactness Optimization (IFC)**: Enhances feature compactness at both patch and prototype levels using dynamic-updated memory banks. 3. **Cross-modal Local-to-global Consistency Alignment (CLC)**: Alleviates cross-modal misalignment by aligning features at both patch and object levels, enhancing multi-modal information interaction. Experiments on the MVtec-3D AD and Eyecandies datasets demonstrate that LSFA significantly outperforms previous state-of-the-art methods, achieving a 97.1% 1-AUROC on MVtec-3D, a 3.4% improvement over the previous best method. The method also shows robustness in few-shot settings and compares favorably with fine-tuning methods like LoRA and AdaLoRA.The paper "Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection" addresses the challenge of detecting subtle geometric surface anomalies in industrial settings, where only 2D RGB data is typically available. The authors propose a novel method called Local-to-global Self-supervised Feature Adaptation (LSFA) to enhance the performance of multi-modal anomaly detection. LSFA focuses on fine-tuning pre-trained models (e.g., ImageNet) to better adapt to the specific domain of industrial data, addressing the domain gap issue. The method optimizes both intra-modal and cross-modal features, ensuring high-quality and consistent representations for anomaly detection. Key contributions include: 1. **LSFA Framework**: LSFA adaptively fine-tunes pre-trained models to learn task-oriented representations, improving the detection of subtle anomalies. 2. **Intra-modal Feature Compactness Optimization (IFC)**: Enhances feature compactness at both patch and prototype levels using dynamic-updated memory banks. 3. **Cross-modal Local-to-global Consistency Alignment (CLC)**: Alleviates cross-modal misalignment by aligning features at both patch and object levels, enhancing multi-modal information interaction. Experiments on the MVtec-3D AD and Eyecandies datasets demonstrate that LSFA significantly outperforms previous state-of-the-art methods, achieving a 97.1% 1-AUROC on MVtec-3D, a 3.4% improvement over the previous best method. The method also shows robustness in few-shot settings and compares favorably with fine-tuning methods like LoRA and AdaLoRA.