1 June 2024 | Xianxun Zhu, Chaopeng Guo, Heyang Feng, Yao Huang, Yichen Feng, Xiangyang Wang, Rui Wang
The paper provides a comprehensive review of key technologies for emotion analysis using multimodal information, emphasizing the integration of speech, text, images, and physiological signals. It offers an overview of relevant literature, academic forums, and competitions, focusing on unimodal processing methods and multimodal data fusion techniques. The review highlights the importance of analyzing emotions across multiple modalities, discusses challenges such as dataset creation, modality synchronization, and limited data scenarios, and provides practical solutions. The paper also covers emotion elicitation, expression, and representation models, and explores applications in areas like driver sentiment detection and medical evaluations. It serves as a valuable resource for scholars and industry professionals, outlining current research and potential future directions in deep multimodal emotion analysis.