17 Jul 2024 | Youngjoong Kwon, Baole Fang, Yixing Lu, Haoye Dong, Cheng Zhang, Francisco Vicente Carrasco, Albert Mosella-Montoro, Jianjin Xu, Shingo Takagi, Daeil Kim, Aayush Prakash, and Fernando De la Torre
The paper "Generalizable Human Gaussians for Sparse View Synthesis" addresses the challenge of rendering photorealistic and accurate views of new human subjects from very sparse input views. The authors propose a method called Generalizable Human Gaussians (GHG) that leverages recent advancements in Gaussian Splatting and 3D human template models to achieve this goal. Key contributions include:
1. **Reformulation of 3D Gaussian Parameters**: The method reformulates the optimization of 3D Gaussian parameters into a regression process defined on the 2D UV space of a human template, allowing for the use of 2D convolutions and leveraging strong geometry priors.
2. **Multi-Scaffold Representation**: To better represent complex human geometries, the method introduces a multi-scaffold approach, generating multiple offset meshes by dilating the human template mesh. This helps capture details such as clothing and hair that cannot be accurately represented by a single template mesh.
3. **Feed-Forward Architecture**: GHG is designed as a feed-forward architecture, eliminating the need for test-time optimization or fine-tuning, making it efficient and practical for real-world applications.
4. **Evaluation**: The method is evaluated on two datasets, THuman 2.0 and RenderPeople, demonstrating superior performance in both in-domain and cross-dataset generalization settings compared to existing methods.
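The two geometric ideas above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed dilation offsets, and the simplified per-texel parameterization (position offset, opacity, RGB color, omitting scale and rotation) are all illustrative assumptions.

```python
import numpy as np

def make_scaffolds(vertices, normals, offsets=(0.0, 0.01, 0.02)):
    """Dilate a template mesh along its vertex normals to create
    multiple scaffold layers (hypothetical sketch of the multi-scaffold
    idea; the offset values here are illustrative, not the paper's)."""
    return [vertices + d * normals for d in offsets]

def uv_to_gaussians(param_map, scaffold_uv_positions):
    """Interpret a regressed 2D UV parameter map (e.g. the output of a
    2D CNN) as per-texel 3D Gaussian parameters anchored on a scaffold.

    param_map: (H, W, 7) array -> 3 position-offset + 1 opacity logit
    + 3 color channels (a simplified parameterization for illustration).
    scaffold_uv_positions: (H, W, 3) 3D anchor point for each UV texel.
    """
    offsets = param_map[..., :3]
    opacity = 1.0 / (1.0 + np.exp(-param_map[..., 3]))  # sigmoid
    color = param_map[..., 4:7]
    positions = scaffold_uv_positions + offsets
    return positions.reshape(-1, 3), opacity.reshape(-1), color.reshape(-1, 3)
```

Because the parameters live on the template's 2D UV plane, a standard convolutional network can regress them in a single feed-forward pass, and running the regression once per scaffold layer yields Gaussians at several distances from the body surface.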
The paper also discusses related work, including neural rendering techniques like Neural Radiance Fields (NeRF) and 3D Gaussian Splatting, and compares GHG with state-of-the-art methods in terms of visual quality and runtime. The authors highlight the limitations and future directions, emphasizing the potential societal impacts and ethical considerations of their work.