Affective Behaviour Analysis via Integrating Multi-Modal Knowledge

16 Mar 2024 | Wei Zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tiancheng Guo, Xin Yu
This paper presents a method for affective behavior analysis by integrating multi-modal knowledge. The 6th Affective Behavior Analysis in-the-wild (ABAW) competition evaluates techniques for analyzing human emotions in natural environments using five tasks: Valence-Arousal (VA) Estimation, Expression (EXPR) Recognition, Action Unit (AU) Detection, Compound Expression (CE) Recognition, and Emotional Mimicry Intensity (EMI) Estimation. The authors propose a method that integrates audio, visual, and textual information to extract high-quality features for downstream tasks. They use a transformer-based model to fuse multi-modal information and an ensemble learning strategy to improve generalization across different scenes. The method is evaluated on three datasets: Aff-Wild2, Hume-Vidmimic2, and C-EXPR-DB. The results show that the proposed method achieves superior performance across all tasks. The key contributions include the integration of a large-scale facial expression dataset, the use of a transformer-based multi-modal fusion model, and the application of an ensemble learning strategy to enhance model generalization. The method demonstrates the effectiveness of integrating multi-modal information for affective behavior analysis.
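
The summary describes a transformer-based fusion of audio, visual, and textual features followed by an ensemble over multiple models, but gives no implementation details. The sketch below is a minimal, hypothetical PyTorch illustration of such a fusion module; the feature dimensions, layer counts, modality embeddings, and mean-pooling head are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal sketch of a transformer-based multi-modal fusion model (assumed design,
# not the paper's exact architecture): per-modality features are projected into a
# shared space, tagged with a modality embedding, jointly encoded, and pooled.
import torch
import torch.nn as nn


class MultiModalFusion(nn.Module):
    def __init__(self, audio_dim=1024, visual_dim=512, text_dim=768,
                 d_model=256, n_heads=4, n_layers=2, n_outputs=2):
        super().__init__()
        # Project each modality's per-frame features into a shared embedding space.
        self.audio_proj = nn.Linear(audio_dim, d_model)
        self.visual_proj = nn.Linear(visual_dim, d_model)
        self.text_proj = nn.Linear(text_dim, d_model)
        # Learned embeddings that tag each token with its source modality.
        self.modality_embed = nn.Embedding(3, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Task head, e.g. valence/arousal regression (n_outputs=2).
        self.head = nn.Linear(d_model, n_outputs)

    def forward(self, audio, visual, text):
        # audio: (B, Ta, audio_dim), visual: (B, Tv, visual_dim), text: (B, Tt, text_dim)
        tokens = torch.cat([
            self.audio_proj(audio) + self.modality_embed.weight[0],
            self.visual_proj(visual) + self.modality_embed.weight[1],
            self.text_proj(text) + self.modality_embed.weight[2],
        ], dim=1)
        fused = self.encoder(tokens)          # cross-modal attention over all tokens
        return self.head(fused.mean(dim=1))   # pool over time and modalities, then predict


# Dummy usage; an ensemble in the spirit of the paper would average predictions
# from several such models trained with different backbones or data splits.
model = MultiModalFusion()
a = torch.randn(2, 50, 1024)   # audio features
v = torch.randn(2, 50, 512)    # visual (facial) features
t = torch.randn(2, 10, 768)    # text features
print(model(a, v, t).shape)    # torch.Size([2, 2])
```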