CFPL-FAS: Class Free Prompt Learning for Generalizable Face Anti-spoofing

21 Mar 2024 | Ajian Liu, Shuai Xue, Jianwen Gan, Jun Wan, Yanyan Liang, Jiankang Deng, Sergio Escalera, Zhen Lei
The paper "CFPL: Class Free Prompt Learning for Generalizable Face Anti-spoofing" addresses the challenge of domain generalization (DG) in face anti-spoofing (FAS) to improve model performance on unseen domains. Traditional methods often rely on domain labels or disentangle generalizable features, leading to semantic distortion and limited generalization. The proposed Class Free Prompt Learning (CFPL) framework leverages large-scale vision-language models like CLIP to dynamically adjust classifier weights using textual features. CFPL introduces two lightweight transformers, Content Q-Former (CQF) and Style Q-Former (SQF), to learn semantic prompts conditioned on content and style features. The framework includes two key improvements: Prompt-Text Matched (PTM) supervision to ensure CQF learns informative visual representations, and Diversified Style Prompt (DSP) technology to diversify style prompts by mixing feature statistics. The learned text features modulate visual features through Prompt Modulation (PM) to enhance generalization. Extensive experiments on cross-domain datasets demonstrate the effectiveness of CFPL, outperforming state-of-the-art methods. The main contributions include the first exploration of DG FAS via textual prompt learning, the introduction of CQF and SQF, and the optimization of text supervision, DSP, and PM.The paper "CFPL: Class Free Prompt Learning for Generalizable Face Anti-spoofing" addresses the challenge of domain generalization (DG) in face anti-spoofing (FAS) to improve model performance on unseen domains. Traditional methods often rely on domain labels or disentangle generalizable features, leading to semantic distortion and limited generalization. The proposed Class Free Prompt Learning (CFPL) framework leverages large-scale vision-language models like CLIP to dynamically adjust classifier weights using textual features. CFPL introduces two lightweight transformers, Content Q-Former (CQF) and Style Q-Former (SQF), to learn semantic prompts conditioned on content and style features. The framework includes two key improvements: Prompt-Text Matched (PTM) supervision to ensure CQF learns informative visual representations, and Diversified Style Prompt (DSP) technology to diversify style prompts by mixing feature statistics. The learned text features modulate visual features through Prompt Modulation (PM) to enhance generalization. Extensive experiments on cross-domain datasets demonstrate the effectiveness of CFPL, outperforming state-of-the-art methods. The main contributions include the first exploration of DG FAS via textual prompt learning, the introduction of CQF and SQF, and the optimization of text supervision, DSP, and PM.