CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing


21 Mar 2024 | Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei
This paper proposes Class Free Prompt Learning (CFPL), a novel approach to Domain Generalization (DG) in Face Anti-Spoofing (FAS). The goal is to improve the performance of FAS models on unseen domains by leveraging large-scale Vision-Language Models (VLMs) such as CLIP. Traditional methods either rely on domain labels to align domain-invariant feature spaces or disentangle generalizable features from the whole sample, which can cause semantic feature distortion and limit generalization.

CFPL introduces two lightweight transformers, a Content Q-Former (CQF) and a Style Q-Former (SQF), that learn semantic prompts conditioned on content and style features, respectively. These prompts dynamically adjust the classifier's weights to explore generalizable visual features. Prompt-Text Matched (PTM) supervision ensures that the CQF learns the visual representations most informative of the content description, while a Diversified Style Prompt (DSP) technique diversifies style-prompt learning by mixing feature statistics between instance-specific styles. Finally, the learned text features modulate the visual features through the designed Prompt Modulation (PM) to generalize across domains.

Because CFPL is built on CLIP, text features can serve as the classifier's weights for learning generalized visual features. Extensive experiments on cross-domain benchmarks, including MSU-MFSD, CASIA-FASD, Replay-Attack, and OULU-NPU, show that CFPL outperforms state-of-the-art methods, with significant improvements in metrics such as HTER, AUC, and TPR@FPR=1%. CFPL also remains effective in cross-domain scenarios where additional training data is available, such as the CASIA-SURF and WMCA datasets.
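The idea of using text features as the classifier's weights follows CLIP's zero-shot recipe: the logit for each class is the (temperature-scaled) cosine similarity between the image embedding and that class's text embedding. The following is a minimal NumPy sketch of that mechanism, not the authors' code; the toy embeddings and "real/spoof prompt" features are purely illustrative.

```python
import numpy as np

def cosine_logits(image_feat, text_feats, temperature=0.01):
    """CLIP-style classification: an L2-normalized image feature is scored
    against L2-normalized text features (one row per class prompt)."""
    img = image_feat / np.linalg.norm(image_feat)
    txt = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    return img @ txt.T / temperature  # one logit per class

# Toy example: 4-dim embeddings, two classes (hypothetical prompt features).
img = np.array([1.0, 0.0, 0.0, 0.0])
texts = np.array([
    [0.9, 0.1, 0.0, 0.0],  # hypothetical "real face" prompt feature
    [0.0, 1.0, 0.0, 0.0],  # hypothetical "spoof face" prompt feature
])
logits = cosine_logits(img, texts)  # higher logit -> closer prompt
```

In CFPL the text features are not fixed hand-written prompts but are produced from the learned content and style prompts, so the effective classifier weights adapt to each input.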
The framework's ability to adapt to different domains and improve generalization through text-based prompt learning makes it a promising approach for FAS tasks.
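HTER, the headline metric in these cross-domain comparisons, is the Half Total Error Rate: the mean of the false acceptance rate (FAR, spoofs accepted as real) and the false rejection rate (FRR, real faces rejected) at a chosen threshold. A small sketch, assuming the convention that higher scores mean "more likely real" and label 1 denotes a real face:

```python
import numpy as np

def hter(scores, labels, threshold=0.5):
    """Half Total Error Rate = (FAR + FRR) / 2 at a fixed threshold.
    scores: higher = more likely real; labels: 1 = real, 0 = spoof."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    real, spoof = scores[labels == 1], scores[labels == 0]
    far = np.mean(spoof >= threshold)  # spoof samples accepted as real
    frr = np.mean(real < threshold)    # real samples rejected as spoof
    return (far + frr) / 2

# Toy scores: one of two spoofs crosses the threshold (FAR = 0.5, FRR = 0).
print(hter([0.9, 0.8, 0.2, 0.6], [1, 1, 0, 0]))  # -> 0.25
```

In cross-domain FAS evaluation the threshold is typically fixed on development data before computing HTER on the unseen target domain, so a lower HTER directly reflects better generalization.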