3 Jul 2024 | Jared Moore, Tanvi Deshpande, Diyi Yang
Are Large Language Models Consistent over Value-laden Questions?
Jared Moore, Tanvi Deshpande, and Diyi Yang from Stanford University investigate whether large language models (LLMs) answer value-laden questions consistently. They define value consistency as the similarity of a model's answers across paraphrases of a single question, related questions under a single topic, multiple-choice versus open-ended use cases, and translations of a question into multiple languages. Using a dataset of over 8,000 questions spanning 300 topics and four languages, they find that LLMs are relatively consistent across these measures, performing as well as or better than human participants on topic and paraphrase consistency. Some inconsistencies remain, however: models are less consistent on controversial topics such as euthanasia than on uncontroversial ones such as women's rights, and base models are more consistent than fine-tuned models, whose consistency varies more from topic to topic. Overall, while models show some consistency, they are not fully consistent across all value-laden domains, and the authors find no evidence that models can be steered to particular values. The study highlights the importance of understanding model consistency in value-laden domains and the need for further research in this area.
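To make the paraphrase-consistency idea concrete, here is a minimal sketch of one plausible way to score it: treat a model's answers to several paraphrases of the same question as a set and report the fraction of answer pairs that agree. This pairwise-agreement metric and the `paraphrase_consistency` helper are illustrative assumptions, not the paper's exact definition.

```python
from itertools import combinations

def paraphrase_consistency(answers: list[str]) -> float:
    """Toy consistency score: fraction of paraphrase pairs whose answers agree.

    `answers` holds one model answer per paraphrase of a single question.
    Assumed metric for illustration only; the paper may define consistency
    differently (e.g., over answer distributions rather than single answers).
    """
    pairs = list(combinations(answers, 2))
    if not pairs:
        return 1.0  # a single paraphrase is trivially consistent with itself
    return sum(a == b for a, b in pairs) / len(pairs)

# Example: three paraphrases of a question about euthanasia (hypothetical answers)
print(paraphrase_consistency(["support", "support", "oppose"]))  # ~0.33
```

The same pairwise scheme extends naturally to the other measures the authors name: compare answers across related questions within a topic, across multiple-choice versus open-ended prompts, or across translations of the question.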