Position: A Roadmap to Pluralistic Alignment

2024 | Taylor Sorensen, Jared Moore, Jillian Fisher, Mitchell Gordon, Niloofar Mireshghallah, Christopher Michael Rytting, Andre Ye, Liwei Jiang, Ximing Lu, Nouha Dziri, Tim Althoff, Yejin Choi
This paper proposes a roadmap for pluralistic alignment in AI systems, focusing on large language models (LLMs) as a testbed. It identifies three ways to define and operationalize pluralism in AI systems: Overton pluralism, which provides a spectrum of reasonable responses; steerable pluralism, which allows models to reflect certain perspectives; and distributional pluralism, which aligns models with population distributions. The paper also defines three types of pluralistic benchmarks: multi-objective benchmarks, trade-off steerable benchmarks, and jury-pluralistic benchmarks.

It argues that current alignment techniques may be fundamentally limited for pluralistic AI, as they may reduce distributional pluralism in models, and provides initial findings to that effect. The paper advocates for further research on pluralistic alignment and proposes a plan for future work toward pluralistic evaluations and alignment.

The paper also discusses the importance of pluralism in AI systems, arguing that AI systems should reflect human diversity and that pluralism can lead to more generalist systems. It outlines the technical benefits of pluralism, including increased interpretability and the ability to measure performance across a variety of objectives and users. The paper concludes that while current alignment techniques have made progress, new methodologies for measuring and aligning are needed to achieve pluralistic alignment.
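The distributional notion above can be made concrete. One simple way to score distributional alignment, a sketch for illustration rather than the paper's own metric, is to compare a model's answer distribution on a multiple-choice question against a human population's answer distribution, for example with Jensen-Shannon divergence. The survey and model numbers below are invented.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric and, with base-2 logs, bounded in [0, 1]."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical data: fraction of a surveyed population choosing each of three
# answer options, versus the model's sampled answer distribution on the same
# question. A perfectly distributionally pluralistic model would match the
# human distribution (JSD = 0).
human = [0.5, 0.3, 0.2]
model = [0.8, 0.15, 0.05]

score = jsd(human, model)  # lower = closer to the population distribution
print(f"JSD: {score:.3f}")
```

Under this kind of metric, the paper's initial finding (that current alignment techniques reduce distributional pluralism) would show up as the JSD between model and population distributions increasing after alignment tuning.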