Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment

23 Jan 2024 | Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou
This paper introduces Ditto, a self-alignment method for enhancing the role-playing capabilities of large language models (LLMs). The authors argue that LLMs inherently possess role-play capabilities because their pre-training corpora contain abundant character profiles and dialogues. Ditto taps this latent knowledge to build a role-play training set covering 4,000 characters, considerably larger than existing datasets. The method has two main steps: collecting character profiles, then simulating role-play dialogues by framing them as a reading comprehension task over those profiles. The LLM is subsequently fine-tuned on this self-generated dataset to strengthen its role-playing abilities.

Evaluations show that Ditto maintains consistent role identity, accurate role-related knowledge, and awareness of cognitive boundaries across a range of parameter scales. Notably, Ditto outperforms existing open-source role-play baselines and reaches performance comparable to advanced proprietary chatbots such as GPT-4. A cross-supervision analysis further shows that consistent role identity can be improved through imitation learning, whereas knowledge-related metrics are bounded by the LLM's inherent capabilities. The authors conclude by stressing the importance of strong foundation models for role-play performance and calling for further research in this area.
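To make the two-step recipe concrete, below is a minimal Python sketch of a Ditto-style self-alignment pipeline. It is not the authors' implementation: the `llm_generate` stub, the prompt templates, and the `CharacterProfile` fields are assumptions added for illustration, and the tiny two-character list stands in for the 4,000-character collection described in the paper.

```python
"""Minimal sketch of a Ditto-style self-alignment pipeline (illustrative only).

Assumptions, not taken from the paper's released code:
  - `llm_generate` is a stand-in for whatever chat LLM backend is used; here it
    is a stub so the script runs end to end.
  - Prompt templates and CharacterProfile fields are simplified.
"""

from dataclasses import dataclass
import json


@dataclass
class CharacterProfile:
    name: str
    profile: str  # background text the model will "read" during simulation


def llm_generate(prompt: str) -> str:
    """Placeholder for a call to the LLM being aligned."""
    return f"[model output for prompt of {len(prompt)} chars]"


def simulate_dialogue(character: CharacterProfile, num_turns: int = 3) -> list[dict]:
    """Step 2: treat role-play simulation as reading comprehension over the profile.

    The same LLM first drafts role-specific user queries, then answers them
    while conditioned on the character's profile.
    """
    turns = []
    for i in range(num_turns):
        query = llm_generate(
            f"Read the profile of {character.name}:\n{character.profile}\n"
            f"Write question #{i + 1} that a curious user might ask this character."
        )
        reply = llm_generate(
            f"You are {character.name}. Stay in character and answer using only "
            f"knowledge consistent with this profile:\n{character.profile}\n\n"
            f"User: {query}"
        )
        turns.append({"user": query, "assistant": reply})
    return turns


def build_training_set(characters: list[CharacterProfile]) -> list[dict]:
    """Turn simulated dialogues into supervised fine-tuning records."""
    records = []
    for character in characters:
        for turn in simulate_dialogue(character):
            records.append(
                {
                    "system": f"Role-play as {character.name}. Profile: {character.profile}",
                    "user": turn["user"],
                    "assistant": turn["assistant"],
                }
            )
    return records


if __name__ == "__main__":
    # Step 1: collect character profiles (a hand-written stand-in here).
    characters = [
        CharacterProfile("Sherlock Holmes", "Consulting detective in Victorian London ..."),
        CharacterProfile("Marie Curie", "Physicist and chemist, pioneer of radioactivity research ..."),
    ]
    dataset = build_training_set(characters)
    # Step 3: the resulting records would be fed to standard supervised fine-tuning.
    print(json.dumps(dataset[0], indent=2))
```

In pure self-alignment, the same model both simulates the dialogues and is fine-tuned on them; the paper's cross-supervision analysis appears to vary the simulating model to study how supervision quality affects identity consistency versus knowledge-related metrics.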