A Construct-Optimize Approach to Sparse View Synthesis without Camera Pose

July 27-August 1, 2024 | Kaiwen Jiang, Yang Fu, Mukund Varma T, Yash Belhe, Xiaolong Wang, Hao Su, Ravi Ramamoorthi
This paper introduces a construct-and-optimize approach for sparse view synthesis without camera poses. The method builds a 3D Gaussian splatting solution progressively: monocular depth estimates are used to backproject pixels from each training view into the 3D world. During construction, the solution is refined by detecting 2D correspondences between the training views and rendered images. A unified differentiable pipeline handles camera registration and the joint adjustment of camera poses and depths, followed by backprojection. The authors also introduce the novel concept of an expected surface in Gaussian splatting, which is critical to the optimization process. Together, these steps yield a coarse solution that can then be refined with standard optimization methods.

The method is evaluated on the Tanks and Temples and Static Hikes datasets with as few as three widely spaced views, achieving significantly better quality than competing methods, including those given approximate camera pose information. Results improve with more views and outperform previous InstantNGP and Gaussian Splatting algorithms even when using half the dataset. Because it is robust to sparse views and needs no pre-estimated camera poses, the approach suits scenarios where camera poses are unknown or inaccurate. Extensive experiments and comparisons validate its effectiveness for sparse view synthesis.
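The backprojection step described above, lifting each pixel into the 3D world using a monocular depth estimate and a camera pose, can be sketched in a few lines. This is a generic pinhole-camera sketch, not the paper's implementation; the function name `backproject` and the argument layout are illustrative assumptions.

```python
import numpy as np

def backproject(depth, K, cam_to_world):
    """Lift every pixel (u, v) with depth d to a 3D world point.

    Computes X_world = R @ (d * K^-1 @ [u, v, 1]^T) + t, where K is the
    3x3 camera intrinsics and cam_to_world is a 4x4 camera-to-world pose.
    Returns an (H*W, 3) array of world-space points.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Homogeneous pixel coordinates, one row per pixel: (u, v, 1)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T          # camera-space ray directions
    pts_cam = rays * depth.reshape(-1, 1)    # scale each ray by its depth
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t                 # transform into world space
```

In the paper's pipeline such backprojected points seed the 3D Gaussians, and because the operation is differentiable in both depth and pose, gradients from the rendering loss can flow back to adjust them.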