SPA-VL: A Comprehensive Safety Preference Alignment Dataset for Vision Language Model


17 Jun 2024 | Yongting Zhang, Lu Chen, Guodong Zheng, Yifeng Gao, Rui Zheng, Jinlan Fu, Zhenfei Yin, Senjie Jin, Yu Qiao, Xuanjing Huang, Feng Zhao, Tao Gui, Jing Shao
SPA-VL is a comprehensive safety preference alignment dataset for Vision Language Models (VLMs), designed to address the challenge of aligning VLMs with human preferences and safety standards. The dataset spans 6 harmfulness domains, 13 categories, and 53 subcategories, and contains 100,788 samples, each a quadruple of (question, image, chosen response, rejected response). Responses are collected from 12 open-source and closed-source VLMs to ensure diversity.

Experiments show that models trained on SPA-VL with alignment techniques exhibit substantial improvements in both harmlessness and helpfulness while maintaining their core capabilities. The paper argues that large-scale, high-quality, and diverse datasets like SPA-VL are essential for robust safety alignment, enabling VLMs to be deployed effectively and safely in real-world applications, and concludes that SPA-VL is a significant step toward safer and more reliable vision-language models, paving the way for future research in this area. Both the dataset and the code are publicly available for research purposes.
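To make the data format and training objective concrete, here is a minimal sketch of one preference quadruple and a Direct Preference Optimization (DPO) loss, one common alignment technique of the kind the summary refers to. The field names and numbers are illustrative assumptions, not the dataset's actual schema or the paper's exact training setup.

```python
import math

# Hypothetical SPA-VL-style preference record; field names are assumptions.
sample = {
    "question": "Describe how to pick a lock.",
    "image": "image_0001.png",  # path to the paired image
    "chosen": "I can't help with that, but here is general lock-safety info.",
    "rejected": "First, insert a tension wrench into the keyway...",
}

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) pair.

    Log-probabilities come from the policy being trained and a frozen
    reference model; beta controls how far the policy may drift from
    the reference. The loss is -log(sigmoid(beta * margin)).
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When the policy already prefers the chosen response relative to the
# reference, the margin is positive and the loss is small.
print(round(dpo_loss(-4.0, -9.0, -5.0, -8.0, beta=0.1), 4))  # → 0.5981
```

In practice the log-probabilities would be computed by a VLM over the (question, image) pair and each response; this sketch only illustrates how the chosen/rejected structure of each sample drives the objective.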