GoMAvatar: Efficient Animatable Human Modeling from Monocular Video Using Gaussians-on-Mesh

11 Apr 2024 | Jing Wen, Xiaoming Zhao, Zhongzheng Ren, Alexander G. Schwing, Shenlong Wang
GoMAvatar is a novel approach for real-time, memory-efficient, high-quality animatable human modeling. It takes a single monocular video as input to create a digital avatar that can be re-articulated in new poses and rendered from novel viewpoints, while integrating seamlessly with graphics pipelines. The core of GoMAvatar is the Gaussians-on-Mesh (GoM) representation, which combines the benefits of Gaussian splatting and deformable meshes: Gaussian splats handle rendering, offering the flexibility to model rich appearance and enabling real-time performance, while a skeleton-driven deformable mesh provides effective articulation and compact geometry. To address view dependency, the method factors the final RGB color into a pseudo albedo map and a pseudo shading map. Extensive experiments on ZJU-MoCap, PeopleSnapshot, and YouTube videos show that GoMAvatar matches or surpasses current monocular human modeling algorithms in rendering quality while significantly outperforming them in computational efficiency (43 FPS) and memory efficiency (3.63 MB per subject).
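To make the Gaussians-on-Mesh idea concrete, below is a minimal NumPy sketch of the two mechanisms the abstract describes: a skeleton-driven mesh deformed by linear blend skinning, with one Gaussian splat anchored to each face, and a final color composed as pseudo albedo times pseudo shading. This is an illustrative sketch only; the function names, the per-face covariance construction, and the skinning driver are assumptions, not the authors' implementation.

```python
import numpy as np

def lbs_deform(verts, weights, joint_transforms):
    """Linear blend skinning: blend per-joint rigid transforms with
    per-vertex weights, then apply the blended transform to each vertex.

    verts:            (V, 3) template vertex positions
    weights:          (V, J) skinning weights, rows sum to 1
    joint_transforms: (J, 4, 4) world transform of each skeleton joint
    """
    V = verts.shape[0]
    verts_h = np.concatenate([verts, np.ones((V, 1))], axis=1)        # (V, 4)
    blended = np.einsum('vj,jab->vab', weights, joint_transforms)     # (V, 4, 4)
    return np.einsum('vab,vb->va', blended, verts_h)[:, :3]

def gaussians_from_mesh(verts, faces):
    """Attach one Gaussian per face: mean at the face barycenter, covariance
    aligned with the face's tangent frame and nearly flat along the normal.
    The specific scale choices here are illustrative assumptions."""
    tri = verts[faces]                        # (F, 3, 3)
    means = tri.mean(axis=1)                  # (F, 3) barycenters
    e1 = tri[:, 1] - tri[:, 0]
    e2 = tri[:, 2] - tri[:, 0]
    n = np.cross(e1, e2)
    n /= np.linalg.norm(n, axis=1, keepdims=True) + 1e-8
    t1 = e1 / (np.linalg.norm(e1, axis=1, keepdims=True) + 1e-8)
    t2 = np.cross(n, t1)
    R = np.stack([t1, t2, n], axis=2)         # (F, 3, 3), columns = frame axes
    s = np.stack([np.linalg.norm(e1, axis=1),
                  np.linalg.norm(e2, axis=1),
                  np.full(len(faces), 1e-3)], axis=1)   # thin along the normal
    covs = R @ (s[:, :, None] ** 2 * np.eye(3)) @ R.transpose(0, 2, 1)
    return means, covs

def compose_color(pseudo_albedo, pseudo_shading):
    """Final RGB as an elementwise product of the two factored maps."""
    return pseudo_albedo * pseudo_shading

# Toy usage: one triangle skinned by two joints, then re-splatted.
verts = np.array([[0., 0., 0.], [1., 0., 0.], [0., 1., 0.]])
faces = np.array([[0, 1, 2]])
weights = np.array([[1., 0.], [0.5, 0.5], [0., 1.]])
T = np.stack([np.eye(4), np.eye(4)])
T[1, :3, 3] = [0., 0., 0.5]                   # translate the second joint
posed = lbs_deform(verts, weights, T)
means, covs = gaussians_from_mesh(posed, faces)
```

Because each splat inherits its face's tangent frame, re-posing the mesh automatically re-positions and re-orients every Gaussian, which is what lets the representation articulate to new poses without re-optimizing the splats and keeps the stored geometry compact.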