Generalizable Human Gaussians for Sparse View Synthesis

17 Jul 2024 | Youngjoong Kwon, Baole Fang, Yixing Lu, Haoye Dong, Cheng Zhang, Francisco Vicente Carrasco, Albert Mosella-Montoro, Jianjin Xu, Shingo Takagi, Daeil Kim, Aayush Prakash, Fernando De la Torre
Generalizable Human Gaussians (GHG) is a method for rendering novel views of humans from sparse input views without test-time optimization. It builds on 3D Gaussian Splatting and reformulates the learning of 3D Gaussian parameters as a regression task on the 2D UV space of a human template. This formulation lets the method exploit strong geometry priors and 2D convolutions for more accurate and photorealistic rendering. A key innovation is a multi-scaffold representation that encodes geometric details and bridges the gap between the template model and real human geometry.
GHG outperforms existing approaches in both within-dataset and cross-dataset generalization settings. Experiments on the THuman 2.0 and RenderPeople datasets show that it achieves competitive results on perceptual metrics such as LPIPS and FID and outperforms other methods in rendering quality and accuracy. The method is trained and tested with 3 input views and generates high-quality renderings of previously unseen humans. It is efficient and can be integrated with 2D-based inpainting modules to hallucinate unobserved regions. These results suggest that GHG is a promising approach to generalizable human rendering from sparse views.
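To make the UV-space formulation more concrete, here is a minimal NumPy sketch of how per-texel parameter maps could be turned into 3D Gaussians anchored on several scaffolds. It is an illustration under stated assumptions, not the authors' implementation: the function names `build_scaffolds` and `unpack_gaussians`, the 11-channel layout, the activations, and the idea of forming scaffolds by offsetting the template surface along its normals are all hypothetical choices made for this example. The network that regresses the parameter maps from the input views is omitted entirely.

```python
import numpy as np


def build_scaffolds(position_map, normal_map, offsets):
    """Create one UV position map per scaffold (assumed construction).

    position_map: (H, W, 3) template surface point stored at each UV texel.
    normal_map:   (H, W, 3) unit surface normal at each UV texel.
    offsets:      K distances; each scaffold shifts the surface along its normals.
    Returns an array of shape (K, H, W, 3).
    """
    offsets = np.asarray(offsets, dtype=position_map.dtype).reshape(-1, 1, 1, 1)
    return position_map[None] + offsets * normal_map[None]


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def unpack_gaussians(param_maps, scaffold_positions, valid_mask):
    """Convert regressed UV maps into a flat list of Gaussians.

    param_maps: (K, H, W, 11) raw outputs; assumed channel layout is
                log-scale (3), rotation quaternion (4), opacity logit (1), RGB logits (3).
    scaffold_positions: (K, H, W, 3) output of build_scaffolds.
    valid_mask: (H, W) boolean, True where a texel maps to the template surface.
    """
    # Reuse the same UV validity mask for every scaffold.
    mask = np.broadcast_to(valid_mask[None], param_maps.shape[:3])
    p = param_maps[mask]                  # (N, 11), N = K * number of valid texels
    centers = scaffold_positions[mask]    # (N, 3)
    quats = p[:, 3:7]
    quats = quats / (np.linalg.norm(quats, axis=1, keepdims=True) + 1e-8)
    return {
        "centers": centers,
        "scales": np.exp(p[:, 0:3]),      # positive scales
        "rotations": quats,               # normalized quaternions
        "opacity": sigmoid(p[:, 7:8]),    # values in (0, 1)
        "rgb": sigmoid(p[:, 8:11]),       # values in (0, 1)
    }


if __name__ == "__main__":
    H, W = 4, 4
    positions = np.zeros((H, W, 3))
    normals = np.zeros((H, W, 3))
    normals[..., 2] = 1.0                 # toy flat surface facing +z
    scaffolds = build_scaffolds(positions, normals, offsets=[0.0, 0.01, 0.02])
    mask = np.zeros((H, W), dtype=bool)
    mask[1:3, 1:3] = True
    raw = np.zeros((3, H, W, 11))         # stand-in for network output
    gaussians = unpack_gaussians(raw, scaffolds, mask)
    print(gaussians["centers"].shape)
```

Because every Gaussian lives at a fixed texel of a 2D map, the regression target has an image-like layout, which is what makes 2D convolutions applicable in this kind of formulation.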