DreamReward: Text-to-3D Generation with Human Preference

21 Mar 2024 | Junliang Ye, Fangfu Liu, Qixiu Li, Zhengyi Wang, Yikai Wang, Xinzhou Wang, Yueqi Duan, Jun Zhu
This paper introduces DreamReward, a text-to-3D generation framework that aligns generated content with human preferences through human feedback. The authors collect a dataset of 2,530 prompts and 25,304 corresponding 3D assets through a systematic annotation process, and use it to train a reward model, Reward3D, that encodes human preferences and evaluates the quality of generated 3D content. Building on Reward3D, the DreamFL algorithm optimizes multi-view diffusion models with a redefined scorer, improving the alignment of generated 3D assets with human preferences.

Extensive experiments compare DreamReward against five baseline 3D generation models and show that it consistently outperforms them in text-asset alignment, 3D plausibility, texture detail, geometry detail, texture-geometry coherency, and overall quality. In user studies, raters prefer DreamReward on average, with 65% of participants judging its outputs to be better aligned with human preferences. The framework is evaluated with multiple metrics, including CLIP, GPTEval3D, and ImageReward, and the results indicate that Reward3D can serve as a reliable evaluation metric for text-to-3D generation, offering a lightweight and efficient alternative to human evaluation. The paper also discusses the limitations of the current approach, including the need for larger annotated datasets to improve diversity, and outlines future work to address these challenges.
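To illustrate how a learned reward can steer score-distillation-based 3D optimization, the sketch below adds a reward term on rendered multi-view images to a standard SDS-style update. This is only a minimal illustration in the spirit of DreamFL, not the paper's actual implementation; the `mv_diffusion`, `reward3d`, and `render_views` objects, their methods, and the `lambda_reward` weighting are hypothetical placeholders.

```python
import torch

# Illustrative sketch only: a reward-guided, SDS-style update in the spirit of
# DreamFL. All model objects and method names below are hypothetical.

def reward_guided_sds_step(mv_diffusion, reward3d, render_views,
                           prompt_emb, optimizer, lambda_reward=0.1):
    """One optimization step: standard SDS gradient plus a Reward3D term."""
    views = render_views()                               # (B, 3, H, W) differentiable renders
    t = torch.randint(20, 980, (1,), device=views.device)  # random diffusion timestep
    noise = torch.randn_like(views)
    noisy = mv_diffusion.add_noise(views, noise, t)      # forward-diffuse the renders

    with torch.no_grad():
        eps_pred = mv_diffusion.predict_noise(noisy, t, prompt_emb)

    # Classic SDS trick: make the gradient w.r.t. the renders equal (eps_pred - noise).
    sds_grad = eps_pred - noise
    sds_loss = (sds_grad * views).sum()

    # Reward term: maximize the Reward3D score of the rendered views.
    reward_loss = -reward3d(views, prompt_emb).mean()

    loss = sds_loss + lambda_reward * reward_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The design intent is simply that the reward gradient pulls the 3D representation toward renders that score higher under the preference model, while the SDS term keeps the optimization anchored to the multi-view diffusion prior.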