RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

11 Jul 2024 | Bowen Zhang, Yiji Cheng, Chunyu Wang, Ting Zhang, Jiaolong Yang, Yansong Tang, Feng Zhao, Dong Chen, and Baining Guo
RodinHD is a method for generating high-fidelity 3D avatars from a single portrait image. To capture intricate details such as hairstyles, it introduces a task-replay strategy and an identity-aware weight consolidation regularizer, which improve the decoder's ability to render sharp details. It also strengthens the guiding effect of the portrait image by computing a finer-grained hierarchical representation of it and injecting that representation into the 3D diffusion model at multiple layers via cross-attention. The model is trained on 46K avatars with a noise schedule optimized for triplanes, and produces 3D avatars with better details and improved cross-view consistency than previous methods. The proposed techniques are general and can be applied to other 3D generation tasks.
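
To make the idea of multi-layer cross-attention conditioning more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`cross_attention`, `inject_hierarchy`) are hypothetical, learned query/key/value projections are omitted, and short lists of numbers stand in for flattened triplane tokens and for image features at each level of the hierarchy. It only illustrates the general pattern of letting the same tokens attend to a different image feature level at each layer.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, context):
    # Each query vector attends over the context vectors.
    # Assumption: keys and values are the raw context vectors
    # (a real model would apply learned projections first).
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in context
        ]
        weights = softmax(scores)
        out.append([
            sum(w * v[d] for w, v in zip(weights, context))
            for d in range(dim)
        ])
    return out

def inject_hierarchy(tokens, image_levels):
    # One cross-attention step per layer, each using a different
    # level of the image feature hierarchy, added residually.
    for level in image_levels:
        attended = cross_attention(tokens, level)
        tokens = [
            [t + a for t, a in zip(tok, att)]
            for tok, att in zip(tokens, attended)
        ]
    return tokens

# Toy data: two 2-D "triplane" tokens, a coarse level with one
# image feature and a finer level with three.
tokens = [[1.0, 0.0], [0.0, 1.0]]
coarse = [[0.5, 0.5]]
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
conditioned = inject_hierarchy(tokens, [coarse, fine])
```

The output keeps the shape of the input tokens, so a step like this could in principle be placed at several depths of a network, with coarser image features at some layers and finer ones at others.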