T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

8 Sep 2024 | Yibo Miao, Yifan Zhu, Yinpeng Dong, Lijia Yu, Jun Zhu, Xiao-Shan Gao
This paper introduces T2VSafetyBench, a new benchmark for evaluating the safety of text-to-video (T2V) models. The rapid development of T2V models such as Sora has raised concerns about potential security risks, including the generation of illegal or unethical content. Previous evaluations have focused primarily on video quality and have not adequately addressed the unique temporal risks inherent in video generation.

T2VSafetyBench defines 12 critical aspects of video generation safety, including pornography, violence, gore, public figures, discrimination, political sensitivity, illegal activities, disturbing content, misinformation, copyright infringement, and temporal risk. A malicious prompt dataset is constructed from real-world prompts, LLM-generated prompts, and jailbreak-attack-based prompts, then manually screened and refined to ensure quality.

The benchmark evaluates these aspects using both GPT-4 and human assessments. The results show that no single model excels in all aspects; different models exhibit different strengths. The correlation between GPT-4 assessments and manual reviews is generally high, indicating that GPT-4 can be used effectively for large-scale evaluation. However, there is a trade-off between the usability and safety of T2V models. As video generation technology advances, safety risks are likely to increase, underscoring the urgency of prioritizing video safety. T2VSafetyBench thus provides insights into the safety of video generation in the era of generative AI. The benchmark comprehensively evaluates several T2V models, including Pika, Gen2, Stable Video Diffusion, and Open-Sora, and finds that each model has distinct strengths and weaknesses across aspects.
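The summary does not reproduce the benchmark's evaluation code. As a rough illustration only, the aggregation step of a frame-sampling safety evaluation might look like the sketch below, where the `judge` callable is a hypothetical stand-in for a GPT-4 vision query that flags a single frame as unsafe; the function name and frame-sampling scheme are assumptions, not the paper's actual implementation.

```python
from typing import Callable, List

def unsafe_rate(videos: List[List[bytes]],
                judge: Callable[[bytes], bool],
                frames_per_video: int = 4) -> float:
    """Fraction of videos judged unsafe.

    A video counts as unsafe if any of its sampled frames is flagged
    by the judge (here a placeholder for a vision-model safety query).
    """
    if not videos:
        return 0.0
    flagged = 0
    for frames in videos:
        # Evenly sample up to `frames_per_video` frames from the clip.
        step = max(1, len(frames) // frames_per_video)
        sampled = frames[::step][:frames_per_video]
        if any(judge(f) for f in sampled):
            flagged += 1
    return flagged / len(videos)
```

Aggregating per-aspect unsafe rates in this way is what allows model-versus-model comparisons across the 12 safety aspects.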
For example, Gen2 performs well in mitigating sexual and disturbing content, while Pika shows strong defensive capabilities in politically sensitive and copyright-related areas. The benchmark also highlights the importance of temporal risk, a security risk unique to T2V models. The results suggest that as models become more capable, the risk of generating unsafe content may increase unless it is explicitly addressed. T2VSafetyBench aims to provide a comprehensive understanding of video generation safety and to help improve it in the future.
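The claimed agreement between GPT-4 assessments and manual reviews is a correlation over evaluation scores. One standard way to quantify such agreement is a Pearson correlation over per-aspect unsafe rates; the minimal sketch below shows the computation only, since the paper's actual aspect-level numbers are not reproduced here.

```python
import math
from typing import Sequence

def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length
    score sequences (e.g. GPT-4 vs. human unsafe rates per aspect)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A coefficient near 1.0 would support using GPT-4 as a scalable proxy for human review, while a low value would argue for keeping humans in the loop.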