Facial Affective Behavior Analysis with Instruction Tuning

12 Jul 2024 | Yifan Li, Anh Dao, Wentao Bao, Zhen Tan, Tianlong Chen, Huan Liu, and Yu Kong
The paper addresses facial affective behavior analysis (FABA) with multi-modal large language models (MLLMs). It introduces an instruction-following FABA dataset, *FABA-Instruct*, containing 19K aligned in-the-wild face images with 30K fine-grained emotion and action unit (AU) annotations. Building on this dataset, it proposes a benchmark, *FABA-Bench*, that evaluates both the visual recognition and the text generation performance of MLLMs on FABA tasks. It also introduces an MLLM, *EmoLA*, which incorporates a facial prior expert and a low-rank adaptation module to improve recognition ability. Experiments on traditional FABA datasets and on *FABA-Bench* show that EmoLA achieves results competitive with or superior to state-of-the-art models. An ablation study validates the contributions of the facial prior token and the tuning strategies.
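To make the term "low-rank adaptation" concrete, here is a minimal sketch of the general idea in plain Python: a frozen weight matrix is augmented with a trainable product of two small matrices. This is not EmoLA's implementation, which the summary does not describe; the class name `LoRALinear`, the helpers `matmul` and `transpose`, and the `rank` and `alpha` values are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


class LoRALinear:
    """Hypothetical linear layer with a low-rank update: y = x W^T + (alpha / r) x A^T B^T."""

    def __init__(self, weight, rank=2, alpha=4.0, seed=0):
        rng = random.Random(seed)
        out_dim, in_dim = len(weight), len(weight[0])
        self.weight = weight  # pretrained weight, kept frozen
        # A starts with small random values, B starts at zero, so the
        # low-rank update contributes nothing before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(out_dim)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matmul(x, transpose(self.weight))
        update = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return [[b + self.scale * u for b, u in zip(b_row, u_row)]
                for b_row, u_row in zip(base, update)]

    def trainable_params(self):
        # Only A and B would be trained; the frozen weight is excluded.
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


weight = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # toy 2x3 "pretrained" weight
layer = LoRALinear(weight, rank=1)
output = layer.forward([[1.0, 0.0, -1.0]])
```

Because `B` is initialized to zero, `output` equals what the frozen weight alone would produce. With an `m x n` weight and rank `r`, only `r * (m + n)` values are trainable, which is far fewer than `m * n` when `r` is small relative to the layer dimensions.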