October 21–November 01, 2024 | Yuntao Shou, Tao Meng*, Fuchen Zhang, Nan Yin, Keqin Li
This paper revisits Multi-modal Emotion Recognition in Conversation (MERC) and proposes a method that combines feature disentanglement with multi-modal feature fusion. The authors argue that long-range contextual semantic information should be extracted during feature disentanglement, and that inter-modal semantic consistency should be maximized during feature fusion. Inspired by State Space Models (SSMs), they introduce Broad Mamba, which uses SSMs to model long-distance dependencies efficiently and a broad learning system to explore potential data distributions. They also propose a probability-guided fusion mechanism that uses predicted label probabilities to weight the features of each modality. Experiments on the IEMOCAP and MELD datasets show that the proposed method outperforms existing methods in both accuracy and efficiency, which the authors present as evidence of its potential as a next-generation architecture for MERC.
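The probability-guided fusion idea can be illustrated with a minimal sketch. The summary above says only that predicted label probabilities are used to weight modal features; the specifics here are assumptions for illustration, not the authors' formulation. In particular, the function name `probability_guided_fusion`, the use of each modality's maximum softmax probability as its confidence, and the normalized weighted sum are all hypothetical choices.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def probability_guided_fusion(features, logits):
    """Fuse per-modality feature vectors using prediction confidence.

    features: dict mapping modality name -> feature vector (equal lengths).
    logits:   dict mapping modality name -> emotion-class logits produced
              by that modality's own classifier.

    Assumption (not from the paper): a modality's confidence is its
    highest predicted class probability, and the fusion weights are the
    confidences normalized to sum to 1.
    """
    confidences = {name: max(softmax(z)) for name, z in logits.items()}
    total = sum(confidences.values())
    weights = {name: c / total for name, c in confidences.items()}

    dim = len(next(iter(features.values())))
    fused = [0.0] * dim
    for name, vec in features.items():
        for i, value in enumerate(vec):
            fused[i] += weights[name] * value
    return fused, weights


if __name__ == "__main__":
    # Toy inputs: the text classifier is confident, the audio one is not.
    feats = {"text": [1.0, 0.0], "audio": [0.0, 1.0]}
    preds = {"text": [5.0, 0.0, 0.0], "audio": [0.0, 0.0, 0.0]}
    fused, weights = probability_guided_fusion(feats, preds)
    print(weights)
    print(fused)
```

Under these assumptions, a modality whose classifier is more confident about its predicted emotion contributes more to the fused representation, while a modality with a near-uniform prediction is down-weighted rather than discarded.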