5 Mar 2024 | Xun Lin, Shuai Wang, Rizhao Cai, Yizhong Liu, Ying Fu, Zitong Yu, Wenzhong Tang, Alex Kot
This paper proposes a multi-modal domain generalized (MMDG) framework for face anti-spoofing (FAS) that addresses two challenges: modality unreliability and modality imbalance. The framework has two key components, the Uncertainty-Guided Cross-Adapter (U-Adapter) and Rebalanced Modality Gradient Modulation (ReGrad). The U-Adapter uses uncertainty estimates to suppress unreliable information during cross-modal feature fusion, while ReGrad dynamically adjusts the gradients of all modalities to balance their convergence speeds. The paper also introduces a large-scale benchmark for evaluating multi-modal FAS under domain generalization scenarios.
Extensive experiments show that MMDG outperforms state-of-the-art methods in both detection performance and generalizability. The U-Adapter and ReGrad are each shown to be effective against modality unreliability and imbalance, respectively, which improves detection accuracy in unseen deployment environments. The framework is evaluated on multiple datasets and protocols covering fixed modalities, missing modalities, and limited source domains. The results support the generalizability of MMDG and its potential for deployment in real-world applications.
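The two ideas summarized above can be illustrated with a toy sketch. This is not the paper's formulation: the function names, the `exp(-u)` weighting, and the loss-ratio scaling rule are all assumptions chosen only to show the general intuition. The intuition is that uncertain auxiliary features contribute less to fusion, and that faster-converging modalities receive smaller gradient scales.

```python
import math


def uncertainty_weighted_fusion(target, auxiliary, uncertainty):
    """Add auxiliary-modality features to target features, element by element.

    Assumption (not from the paper): each auxiliary feature is weighted by
    exp(-u), so u = 0 keeps it fully and a large u suppresses it.
    """
    return [t + math.exp(-u) * a for t, a, u in zip(target, auxiliary, uncertainty)]


def rebalance_gradient_scales(losses):
    """Return one gradient scale per modality, given its current loss.

    Assumption (not from the paper): a lower loss means the modality is
    converging faster, so its gradient is scaled down relative to the
    modality with the highest loss, which keeps a scale of 1.0.
    """
    slowest = max(losses.values())
    if slowest == 0:
        # Every modality has zero loss; leave all gradients unscaled.
        return {name: 1.0 for name in losses}
    return {name: loss / slowest for name, loss in losses.items()}


# Hypothetical values: the second auxiliary feature is highly uncertain.
fused = uncertainty_weighted_fusion([1.0, 1.0], [2.0, 2.0], [0.0, 10.0])

# Hypothetical per-modality losses for RGB, depth, and infrared branches.
scales = rebalance_gradient_scales({"rgb": 0.2, "depth": 0.8, "ir": 0.4})
```

In this sketch the first fused feature keeps the full auxiliary contribution, while the second stays close to the target value because its uncertainty is high. The depth branch, which has the largest loss, keeps its full gradient, and the RGB branch is slowed the most.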