Social Choice Should Guide AI Alignment in Dealing with Diverse Human Feedback

2024 | Vincent Conitzer, Rachel Freedman, Jobst Heitzig, Wesley H. Holliday, Bob M. Jacobs, Nathan Lambert, Milan Mossé, Eric Pacuit, Stuart Russell, Hailey Schoelkopf, Emanuel Tewolde, William S. Zwicker
This paper argues that social choice theory can help address challenges in aligning AI systems with human values, particularly in handling diverse human feedback. Current methods such as reinforcement learning from human feedback (RLHF) and constitutional AI face limitations, including unrepresentative data, unrealistic models of human decision-making, and insufficient modeling of human diversity.

Social choice theory provides tools for aggregating preferences in ways that are consistent and fair. The paper discusses how social choice methods can help determine which humans should provide input, what type of feedback to collect, and how that feedback should be aggregated and used. It highlights the relevance to AI alignment of classic social choice criteria such as resistance to strategic voting, independence of clones, and anonymity.

The paper also explores how diverse human feedback can be processed and aggregated, and how social choice theory can help avoid inconsistencies in collective decisions; a small illustration of such an inconsistency appears below. It emphasizes the need for further research applying social choice theory to AI alignment so that AI systems are fair, transparent, and aligned with societal values, and concludes that closer collaboration between social choice and AI ethics researchers is essential.
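To make the aggregation problem concrete, here is a minimal sketch, not taken from the paper (which prescribes no particular voting rule), of the kind of inconsistency the authors point to. It uses hypothetical ballots in which three annotators each rank three candidate model responses: pairwise majority voting produces a Condorcet cycle, while a positional rule such as Borda count still returns well-defined scores.

```python
# A minimal sketch (illustrative, not the paper's method) of two classic
# social choice facts: (1) pairwise majority voting can yield an
# intransitive "cycle" (the Condorcet paradox), and (2) a positional rule
# such as Borda count always produces well-defined collective scores.
from itertools import combinations

# Hypothetical ballots: each annotator ranks three candidate model
# responses, best first. This profile is the textbook three-voter cycle.
ballots = [
    ["A", "B", "C"],
    ["B", "C", "A"],
    ["C", "A", "B"],
]

def majority_margin(ballots, x, y):
    """Voters ranking x above y, minus voters ranking y above x."""
    return sum(1 if b.index(x) < b.index(y) else -1 for b in ballots)

def borda_scores(ballots):
    """Each ballot awards a candidate (n - 1 - position) points."""
    n = len(ballots[0])
    scores = {c: 0 for c in ballots[0]}
    for b in ballots:
        for pos, c in enumerate(b):
            scores[c] += n - 1 - pos
    return scores

# Pairwise majorities: A beats B, B beats C, and C beats A -- a cycle,
# so "follow the majority" yields no consistent collective ranking.
for x, y in combinations(sorted(ballots[0]), 2):
    print(f"{x} vs {y}: margin {majority_margin(ballots, x, y):+d}")

# Borda count still aggregates to well-defined scores (here a three-way
# tie, reflecting the profile's symmetry rather than an inconsistency).
print(borda_scores(ballots))
```

The broader point the sketch illustrates: the choice of aggregation rule determines which axioms (for example, independence of clones or resistance to strategic voting) the aggregated feedback satisfies, and characterizing those trade-offs is precisely what social choice theory offers to AI alignment.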