This study investigates how integrating persona variables (demographic, social, and behavioral factors) into large language models (LLMs) affects their ability to simulate diverse perspectives in subjective NLP tasks. The research finds that persona variables account for less than 10% of the variance in annotations in existing subjective NLP datasets. Nevertheless, incorporating persona variables via prompting yields modest but statistically significant improvements in LLM predictions. Persona prompting is most effective on samples where many annotators disagree but their disagreements are relatively minor. A linear relationship is observed between how strongly persona variables correlate with human annotations and how accurately LLMs predict those annotations: the stronger the correlation, the more accurate the predictions. The study also examines how well LLMs simulate personas as the importance of persona variables varies, finding a linear relationship between the target R² (the annotation variance explainable by a linear regression trained on ground-truth annotations) and the R² achieved by the LLM. In a zero-shot setting, the best-performing model, a 70b model with persona prompting, captures 81% of this target R². For most subjective NLP datasets, however, where persona variables have limited explanatory power, the benefits of persona prompting remain limited.
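To make the variance-explained comparison concrete, the following sketch (not the authors' code; the toy data, column names, and persona variables are hypothetical) fits a linear regression of human annotations on persona variables to obtain a "target" R², computes the R² of persona-prompted LLM predictions against the same annotations, and reports the fraction of the target R² the LLM recovers:

    import pandas as pd
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import r2_score

    # Toy data standing in for one annotation-level dataset: one row per
    # (annotator, item) pair with persona variables and an LLM prediction.
    df = pd.DataFrame({
        "age":        [23, 45, 31, 67, 52, 29, 41, 38],
        "education":  ["hs", "ba", "ba", "ma", "hs", "ma", "ba", "hs"],
        "annotation": [1, 3, 2, 4, 2, 4, 3, 1],   # human labels, e.g. 1-5 ratings
        "llm_pred":   [2, 3, 2, 4, 2, 3, 3, 2],   # persona-prompted LLM outputs
    })

    persona_cols = ["age", "education"]  # hypothetical persona variables

    # One-hot encode categorical persona variables before fitting.
    X = pd.get_dummies(df[persona_cols], drop_first=True)
    y = df["annotation"]

    # "Target" R^2: annotation variance explainable from persona variables
    # alone via linear regression (under 10% in most datasets studied).
    target_r2 = LinearRegression().fit(X, y).score(X, y)

    # "Predicted" R^2: variance actually captured by the persona-prompted LLM.
    predicted_r2 = r2_score(y, df["llm_pred"])

    # Fraction of the regression ceiling recovered by the LLM
    # (roughly 81% for the best 70b model in the zero-shot setting above).
    print(f"target R^2 = {target_r2:.3f}, LLM R^2 = {predicted_r2:.3f}, "
          f"ratio = {predicted_r2 / target_r2:.2%}")

In the paper's terms, the first quantity corresponds to the explanatory power of persona variables and the ratio to the share of that ceiling the LLM captures; the toy numbers here are only illustrative.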
The study concludes that persona prompting is not a reliable way to simulate different perspectives in existing NLP tasks, given the low explanatory power of persona variables in most datasets. It recommends caution when using LLMs for simulation, especially in NLP tasks where persona variables are likely to have weak influence, and suggests more strategic dataset design to improve the predictability of LLM simulations. The study acknowledges limitations, including the inherent subjectivity of human behavior and potential biases in LLM simulations. Ethical considerations are also discussed, including the use of anonymized datasets and the potential risks of LLM simulations. Finally, the study stresses that LLM simulation capabilities should be evaluated independently for each language, without translation.