PICLe: Eliciting Diverse Behaviors from Large Language Models with Persona In-Context Learning

2024 | Hyeong Kyu Choi, Yixuan Li
PICLe is a framework for eliciting diverse behaviors from large language models (LLMs) through persona in-context learning (ICL). Grounded in Bayesian inference, PICLe uses a likelihood-ratio criterion to select the demonstrative examples that steer the model toward a target persona: examples are chosen to maximize the likelihood of the target persona, focusing the model on the desired behavior.

Across three contemporary LLMs, PICLe outperforms existing selection methods, reaching an average success rate of 88.1% on Llama-2 compared with a baseline of 65.5%. Performance is evaluated with four metrics: action consistency, action confidence, action uncertainty, and degree of alteration, and PICLe performs consistently well across all four, demonstrating its effectiveness in eliciting diverse personas. The method also transfers to non-RLHF models such as Vicuna and GPT-J, where it yields significant improvements. Ablation studies and experiments varying the number of in-context examples further show that the framework is robust to hyperparameter choices and computationally efficient relative to baseline methods. Overall, PICLe offers a systematic way to customize LLM behavior to align with specific personality traits.
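To make the likelihood-ratio selection concrete, below is a minimal sketch of one plausible reading: it assumes a base causal LM and a "persona" model fine-tuned on statements of the target persona (both checkpoints are placeholders), scores each candidate statement by log p_persona(x) − log p_base(x), and keeps the top-k as in-context demonstrations. The exact scoring details in the paper (e.g., any length normalization) may differ.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def sequence_log_likelihood(model, tokenizer, text, device="cpu"):
    """Sum of token log-probabilities of `text` under `model`."""
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        logits = model(ids).logits
    # Shift so each position predicts the next token.
    log_probs = torch.log_softmax(logits[:, :-1, :], dim=-1)
    targets = ids[:, 1:]
    token_ll = log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return token_ll.sum().item()

def select_icl_examples(candidates, base_model, persona_model, tokenizer, k=3):
    """Rank candidates by the likelihood ratio
    log p_persona(x) - log p_base(x); return the top-k
    as in-context demonstrations."""
    scored = []
    for text in candidates:
        ll_persona = sequence_log_likelihood(persona_model, tokenizer, text)
        ll_base = sequence_log_likelihood(base_model, tokenizer, text)
        scored.append((ll_persona - ll_base, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]

# Hypothetical usage: checkpoint names are illustrative only.
# tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
# base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
# persona_model = AutoModelForCausalLM.from_pretrained("./persona-sft-checkpoint")
# demos = select_icl_examples(candidate_statements, base_model, persona_model, tokenizer)
```

The ratio rewards examples that the persona-adapted model finds much more likely than the base model does, which is what lets the selected demonstrations emphasize persona-specific behavior rather than generically probable text.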