23 May 2024 | Yifan Zhong, Chengdong Ma, Xiaoyuan Zhang, Ziran Yang, Haojun Chen, Qingfu Zhang, Siyuan Qi, Yaodong Yang
Panacea is an approach for aligning large language models (LLMs) with multi-dimensional human preferences. It reframes alignment as a multi-dimensional preference optimization problem, enabling a single model to adapt online and Pareto-optimally to diverse preferences without further tuning. The key challenge is using a low-dimensional preference vector to steer the behavior of the whole model, which Panacea addresses through SVD-based low-rank adaptation, embedding the preference vector into the singular values of the adapted weights.

Theoretically, Panacea recovers the entire Pareto front under mild conditions. Empirically, a single Panacea-trained LLM represents an exponentially large spectrum of human preferences: it outperforms baseline methods, producing fronts that are superior, more uniformly distributed, and convex, and it allows preference vectors to be specified online so the model adapts swiftly to any stated trade-off.

Panacea is the first fundamentally Pareto-set-learning approach to multi-dimensional preference alignment, and it is computationally efficient and scalable. Preference embedding gives fine-grained control over model behavior, and the learned solution sets remain superior, with no performance saturation, even on high-dimensional problems. This work addresses the limitations of the scalar-label, single-objective alignment paradigm and offers a robust way to align models with diverse and complex human preferences.
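The SVD-based low-rank adaptation can be illustrated with a minimal NumPy sketch. The layout below, in which the rightmost singular values of the low-rank update are occupied by the user-supplied preference vector while the rest stay learnable, follows the idea described above, but the shapes, the `scale` factor, and all variable names are illustrative assumptions rather than the paper's exact implementation:

```python
import numpy as np

def panacea_delta(U, V, sigma_learned, pref, scale):
    """Compose a low-rank weight update whose rightmost singular values
    are the preference vector (hypothetical Panacea-style SVD-LoRA layout).

    U: (d_out, r) left factor; V: (r, d_in) right factor;
    sigma_learned: (r - k,) learnable singular values;
    pref: (k,) preference vector on the simplex; scale: scalar gain.
    """
    sigma = np.concatenate([sigma_learned, scale * np.asarray(pref, dtype=float)])
    # Broadcasting U * sigma over columns equals U @ diag(sigma) @ V.
    return (U * sigma) @ V

rng = np.random.default_rng(0)
d_out, d_in, r, k = 8, 6, 4, 2
U = rng.standard_normal((d_out, r))
V = rng.standard_normal((r, d_in))
sigma_learned = rng.standard_normal(r - k)
W0 = rng.standard_normal((d_out, d_in))  # frozen base weight

# Two preference vectors (e.g. weighting helpfulness vs. harmlessness)
# yield two different adapted weights from the same learned factors,
# with no retraining -- only the injected vector changes.
W_help = W0 + panacea_delta(U, V, sigma_learned, [1.0, 0.0], scale=2.0)
W_harm = W0 + panacea_delta(U, V, sigma_learned, [0.0, 1.0], scale=2.0)
```

This is what makes online adaptation cheap: the factors `U`, `V`, and `sigma_learned` are trained once, and switching preferences only swaps a k-dimensional vector into the singular-value slots.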
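The multi-dimensional objective that the preference vector steers can be sketched as a preference-weighted aggregation of per-dimension rewards. The simplest choice is linear scalarization, shown below; the reward values and dimension names are made up for illustration, and other aggregation functions (e.g. weighted Tchebycheff) can be substituted for the dot product:

```python
import numpy as np

def linear_scalarization(rewards, pref):
    """Collapse per-objective rewards into one scalar, weighting each
    alignment dimension by the preference vector (pref must lie on the
    probability simplex: nonnegative entries summing to 1)."""
    pref = np.asarray(pref, dtype=float)
    assert np.all(pref >= 0) and np.isclose(pref.sum(), 1.0)
    return float(np.dot(rewards, pref))

# Hypothetical rewards for one response on two alignment dimensions,
# e.g. (helpfulness, harmlessness).
rewards = np.array([0.9, 0.2])

helpful_first = linear_scalarization(rewards, [1.0, 0.0])  # ignores harmlessness
balanced = linear_scalarization(rewards, [0.5, 0.5])       # equal trade-off
```

Optimizing this scalarized objective across all preference vectors simultaneously is what lets one model cover the whole Pareto front instead of one point per training run.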