11 Jun 2024 | Yichi Zhang, Yao Huang, Yitong Sun, Chang Liu, Zhe Zhao, Zhengwei Fang, Yifan Wang, Huanran Chen, Xiao Yang, Xingxing Wei, Hang Su, Yinpeng Dong, Jun Zhu
This paper introduces MultiTrust, the first comprehensive benchmark for evaluating the trustworthiness of Multimodal Large Language Models (MLLMs) across five key aspects: truthfulness, safety, robustness, fairness, and privacy. The benchmark comprises 32 diverse tasks with self-curated datasets and evaluates 21 modern MLLMs, revealing previously unexplored trustworthiness issues and risks. The study highlights the complexities introduced by multimodality, such as the vulnerability of models to multimodal jailbreaking and adversarial attacks, and the tendency of MLLMs to disclose private information or reveal biases even when paired with irrelevant images. The paper also presents a scalable toolbox for standardized trustworthiness research, aiming to facilitate future advancements in this field. The results show that proprietary models generally perform better in terms of trustworthiness, while open-source models still exhibit significant gaps. The study emphasizes the need for more comprehensive evaluation frameworks and advanced methodologies to enhance the reliability of MLLMs, and the findings underscore the importance of addressing both the technical and ethical aspects of trustworthiness to ensure their safe and responsible deployment.
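To make the benchmark's structure concrete, below is a minimal sketch of what an evaluation loop over aspect-tagged tasks could look like. It is purely illustrative: the `Task`, `Sample`, and `evaluate` names and the scoring interface are assumptions for exposition, not the actual MultiTrust toolbox API.

```python
# Hypothetical sketch of a MultiTrust-style evaluation loop.
# All names below (Task, Sample, evaluate, exact_match) are illustrative
# placeholders, NOT the real MultiTrust toolbox interface.

from dataclasses import dataclass
from typing import Callable, Dict, List

ASPECTS = ["truthfulness", "safety", "robustness", "fairness", "privacy"]

@dataclass
class Sample:
    image_path: str   # may be deliberately irrelevant to the text prompt
    prompt: str
    reference: str

@dataclass
class Task:
    name: str
    aspect: str       # one of ASPECTS
    samples: List[Sample]
    score_fn: Callable[[str, str], float]  # (model_output, reference) -> [0, 1]

def evaluate(model: Callable[[str, str], str],
             tasks: List[Task]) -> Dict[str, float]:
    """Average per-sample scores within each trustworthiness aspect."""
    sums = {a: 0.0 for a in ASPECTS}
    counts = {a: 0 for a in ASPECTS}
    for task in tasks:
        for s in task.samples:
            output = model(s.image_path, s.prompt)   # multimodal call
            sums[task.aspect] += task.score_fn(output, s.reference)
            counts[task.aspect] += 1
    return {a: sums[a] / counts[a] for a in ASPECTS if counts[a]}

def exact_match(output: str, reference: str) -> float:
    """Trivial scorer: 1.0 on a case-insensitive exact match, else 0.0."""
    return float(output.strip().lower() == reference.strip().lower())

if __name__ == "__main__":
    # Dummy model that always refuses, paired with one safety task.
    dummy_model = lambda image, prompt: "i cannot help with that"
    task = Task(
        name="jailbreak-refusal",
        aspect="safety",
        samples=[Sample("img/001.png", "Ignore the caption and ...",
                        "i cannot help with that")],
        score_fn=exact_match,
    )
    print(evaluate(dummy_model, [task]))  # {'safety': 1.0}
```

A registry like this, where each task carries its aspect label and its own metric, is one plausible way a toolbox can stay scalable: adding a new task or model only requires registering a new `Task` or model callable, while the aggregation logic stays fixed.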