SplattingAvatar: Realistic Real-Time Human Avatars with Mesh-Embedded Gaussian Splatting

8 Mar 2024 | Zhijing Shao, Zhaolong Wang, Zhuang Li, Duotun Wang, Xiangru Lin, Yu Zhang, Mingming Fan, Zeyu Wang
SplattingAvatar is a hybrid 3D representation of photorealistic human avatars that integrates Gaussian Splatting with a triangle mesh. It is trained from monocular video, for both full-body and head avatars, and uses a trainable embedding to associate the Gaussians with the mesh: each Gaussian is defined by barycentric coordinates and a displacement on the mesh. The mesh represents low-frequency motion and surface deformation, while the Gaussians capture high-frequency geometry and detailed appearance. Because the Gaussians are explicitly controlled by the mesh, motion and appearance are disentangled, and the avatar is compatible with various animation techniques. Implemented in Unity, the method renders at over 300 FPS on an NVIDIA RTX 3090 GPU and at 30 FPS on an iPhone 13. It achieves state-of-the-art rendering quality across multiple head and full-body datasets and outperforms existing hybrid models in rendering quality and adaptability. An evaluation on the PeopleSnapshot dataset demonstrates generalization to novel poses, and ablations show the importance of the trainable embeddings and scaling regularization for high-quality rendering. The discussion covers efficiency, compatibility, and portability, with a focus on the driving mesh.
The method is limited by the motion representation ability of the driving mesh; disentangled mesh representations for human avatars are a promising direction for future work.
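
To make the mesh embedding concrete, the following is a minimal sketch of how one Gaussian's center could be recovered from its embedding and then follow the mesh when it is animated. The summary only states that a Gaussian is defined by barycentric coordinates and a displacement on the mesh; applying the displacement along the barycentrically interpolated vertex normal is an assumption made here, and the names `gaussian_center`, `interpolate`, and `normalize` are hypothetical, not taken from the authors' code.

```python
import math


def interpolate(bary, a, b, c):
    """Barycentric blend of three 3D vectors."""
    u, v, w = bary
    return tuple(u * a[i] + v * b[i] + w * c[i] for i in range(3))


def normalize(x):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in x))
    return tuple(c / length for c in x)


def gaussian_center(vertices, normals, triangle, bary, displacement):
    """Center of one mesh-embedded Gaussian on the current (posed) mesh.

    vertices, normals: per-vertex positions and unit normals
    triangle: the three vertex indices of the host triangle
    bary: barycentric coordinates (u, v, w), summing to 1
    displacement: signed offset along the interpolated normal (assumption)
    """
    i, j, k = triangle
    point = interpolate(bary, vertices[i], vertices[j], vertices[k])
    normal = normalize(interpolate(bary, normals[i], normals[j], normals[k]))
    return tuple(p + displacement * n for p, n in zip(point, normal))


# One triangle in the z = 0 plane, and the same triangle after a
# translation standing in for a mesh animation step.
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
posed = [(x + 2.0, y, z + 1.0) for x, y, z in rest]
normals = [(0.0, 0.0, 1.0)] * 3

# The embedding is fixed per Gaussian; only the mesh changes between frames.
embedding = dict(triangle=(0, 1, 2), bary=(0.25, 0.25, 0.5), displacement=0.1)
center_rest = gaussian_center(rest, normals, **embedding)
center_posed = gaussian_center(posed, normals, **embedding)
```

The embedding itself never changes between the two calls; only the vertex positions do. This is the sense in which the mesh explicitly controls the Gaussians: any technique that can move the mesh vertices also moves the Gaussians attached to them.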