SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model

17 Jun 2024 | Yongting Zhang, Lu Chen, Guodong Zheng, Yifeng Gao, Rui Zheng, Jinlan Fu, Zhenfei Yin, Senjie Jin, Yu Qiao, Xuanjing Huang, Feng Zhao, Tao Gui, Jing Shao
The paper introduces SPA-VL, a comprehensive dataset designed for the safety alignment of Vision Language Models (VLMs). SPA-VL covers 6 harmfulness domains, 13 categories, and 53 subcategories, and contains 100,788 samples. Each sample consists of a question, an image, a chosen response, and a rejected response, ensuring diversity and depth in both content and model responses.

The dataset is constructed in three stages: image collection, question generation, and preference collection. Images are sourced from the LAION-5B dataset, and questions are generated with Gemini 1.0 Pro Vision and other methods to ensure diversity and complexity. Preference data is collected by selecting the better of two responses generated by different models, balancing harmlessness and helpfulness.

Experiments using PPO and DPO on SPA-VL show significant improvements in harmlessness and helpfulness compared to baseline models. The results indicate that increasing the dataset scale, incorporating diverse responses, and mixing question types all enhance the safety and performance of aligned models. The paper also analyzes how different factors, such as data scale, response model selection, question types, and model architecture, affect alignment performance. The findings highlight the importance of comprehensive datasets for achieving robust safety alignment and ensuring the safe deployment of VLMs.
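To make the sample structure concrete, here is a minimal sketch of what one SPA-VL preference record might look like, based on the components described above (question, image, chosen response, rejected response, and the harmfulness taxonomy). The field names are illustrative assumptions, not the dataset's actual schema.

```python
from dataclasses import dataclass

@dataclass
class PreferenceSample:
    """One hypothetical SPA-VL preference record (field names assumed)."""
    image_path: str    # image sourced from the LAION-5B dataset
    question: str      # image-grounded question, possibly harmful in intent
    chosen: str        # preferred response (more harmless and helpful)
    rejected: str      # dispreferred response from a different model
    domain: str        # one of the 6 harmfulness domains
    category: str      # one of the 13 categories
    subcategory: str   # one of the 53 subcategories
```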
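The paper aligns models with PPO and DPO on these chosen/rejected pairs. As a reference point, the sketch below shows the standard DPO objective (Rafailov et al., 2023) that such preference pairs feed into; it is the generic formulation, not the authors' exact training code, and the per-sequence log-probabilities are assumed to be precomputed.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss: push the policy to prefer the chosen response
    over the rejected one by a larger margin than a frozen reference model.
    Each argument is a batch of summed per-sequence log-probabilities."""
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    # -log(sigmoid(margin)): small when chosen outscores rejected.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()
```

In this setup, the chosen and rejected responses from each SPA-VL sample supply the two log-probability terms for the policy and the reference model, so no separate reward model is trained.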