[slides] Social Bias Evaluation for Large Language Models Requires Prompt Variations

This paper investigates the sensitivity of large language models (LLMs) to prompt variations in evaluating task performance and social bias. The study highlights that LLMs are highly sensitive to prompt variations, which can significantly affect their performance and bias scores. The research uses the BBQ dataset to evaluate how different prompt formats, such as task instructions, few-shot examples, and debias-prompt, influence LLMs' outputs. The findings reveal that LLMs exhibit trade-offs between task performance and social bias, with some prompts leading to increased bias despite improved performance. The sensitivity of LLMs to prompts is attributed to the ambiguity of instances, which can lead to varied outputs. The study recommends using diverse prompts to better understand the effects of prompts on social bias in LLMs. The results show that prompt variations can significantly impact the ranking of LLMs and their ability to mitigate bias. The study also demonstrates that debias-prompt strategies can affect LLMs' performance and bias scores, with some prompts improving performance while others may worsen it. Overall, the research underscores the importance of considering prompt variations in evaluating and mitigating social bias in LLMs.This paper investigates the sensitivity of large language models (LLMs) to prompt variations in evaluating task performance and social bias. The study highlights that LLMs are highly sensitive to prompt variations, which can significantly affect their performance and bias scores. The research uses the BBQ dataset to evaluate how different prompt formats, such as task instructions, few-shot examples, and debias-prompt, influence LLMs' outputs. The findings reveal that LLMs exhibit trade-offs between task performance and social bias, with some prompts leading to increased bias despite improved performance. The sensitivity of LLMs to prompts is attributed to the ambiguity of instances, which can lead to varied outputs. The study recommends using diverse prompts to better understand the effects of prompts on social bias in LLMs. The results show that prompt variations can significantly impact the ranking of LLMs and their ability to mitigate bias. The study also demonstrates that debias-prompt strategies can affect LLMs' performance and bias scores, with some prompts improving performance while others may worsen it. Overall, the research underscores the importance of considering prompt variations in evaluating and mitigating social bias in LLMs.

Social Bias Evaluation for Large Language Models Requires Prompt Variations

3 Jul 2024 | Rem Hida, Masahiro Kaneko, Naoaki Okazaki