5 Jul 2024 | Dingkang Yang, Mingcheng Li, Dongling Xiao, Yang Liu, Kun Yang, Zhaoyu Chen, Yuzheng Wang, Peng Zhai, Ke Li, and Lihua Zhang
This paper proposes a Multimodal Counterfactual Inference Sentiment (MCIS) framework that addresses dataset biases in Multimodal Sentiment Analysis (MSA). MSA aims to understand human intentions by integrating emotion-related clues from diverse modalities, such as visual, language, and audio. However, current MSA models suffer from harmful biases, including multimodal utterance-level label bias and word-level context bias, which mislead models into relying on statistical shortcuts and spurious correlations and create performance bottlenecks. MCIS is based on causality rather than conventional likelihood: it uses a causal graph to identify the harmful biases and counterfactual inference to purify and mitigate them. The framework is parameter-free and training-free, so already-trained models can benefit from the causal graph without retraining. During inference, MCIS compares factual and counterfactual outcomes to make unbiased decisions. The framework is evaluated on several standard MSA benchmarks and shows significant performance improvements. The main contributions are: identifying and disentangling label and context biases from a causal inference perspective; proposing a general causality-based framework suitable for different MSA architectures; and demonstrating the effectiveness of the framework through comprehensive experiments. By eliminating the side effects of dataset biases, the framework takes a step towards unbiased prediction in MSA.
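The inference-time comparison of factual and counterfactual outcomes can be illustrated with a minimal sketch. The summary does not give the exact formula, so the subtraction rule below (factual logits minus scaled bias-only logits, a form common in counterfactual debiasing work) is an assumption, and the names `debias_logits`, `predict`, and `alpha`, as well as the example logits, are hypothetical rather than taken from the paper.

```python
from typing import List, Sequence

LABELS = ["negative", "neutral", "positive"]


def debias_logits(
    factual: Sequence[float],
    counterfactual: Sequence[float],
    alpha: float = 1.0,
) -> List[float]:
    """Subtract the bias-only (counterfactual) logits from the factual logits.

    Assumption: the counterfactual logits come from the same trained model
    run on an input that keeps only the biased signal. `alpha` is a
    hypothetical scaling factor, not a parameter named in the source.
    """
    if len(factual) != len(counterfactual):
        raise ValueError("logit vectors must have the same length")
    return [f - alpha * c for f, c in zip(factual, counterfactual)]


def predict(logits: Sequence[float]) -> int:
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Made-up logits for one utterance: the factual pass favours "negative",
# but most of that score is reproduced by the bias-only pass.
factual = [2.0, 1.5, 0.2]
counterfactual = [1.8, 0.3, 0.1]

debiased = debias_logits(factual, counterfactual)

print(LABELS[predict(factual)])   # negative
print(LABELS[predict(debiased)])  # neutral
```

Because the adjustment only touches the output logits, a sketch like this needs no extra parameters or retraining, which matches the parameter-free and training-free property described above.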