29 Nov 2024 | Jiahui Lei¹, Yijia Weng², Adam W. Harley², Leonidas Guibas², Kostas Daniilidis¹,³
The paper introduces 4D Motion Scaffolds (MoSca), a 4D reconstruction system that reconstructs dynamic scenes from casually captured monocular videos and synthesizes novel views of them. MoSca leverages prior knowledge from vision foundation models and encodes the underlying motions and deformations in a compact, smooth representation called a Motion Scaffold. The scene geometry and appearance are disentangled from this deformation representation and then globally fused using Gaussian Splatting. The system also estimates camera poses and focal lengths using bundle adjustment. Experiments demonstrate state-of-the-art performance on dynamic rendering benchmarks and show that the system works on real videos. Key contributions include an automated 4D reconstruction system, a novel deformation representation, and an efficient Gaussian-based dynamic scene representation.
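To make the idea of a compact, smooth deformation representation concrete, the following is a minimal sketch of how a sparse set of moving nodes could drive the motion of arbitrary scene points. It is an illustration under stated assumptions, not the paper's implementation: the summary does not specify how node motions are blended, so this sketch uses translation-only displacements and normalized Gaussian weights over the nearest nodes as a stand-in. The names `blend_weights`, `deform_point`, `node_traj`, `k`, and `sigma` are hypothetical.

```python
import math


def blend_weights(point, nodes, k=2, sigma=1.0):
    """Normalized Gaussian weights over the k nodes nearest to `point`.

    Assumption: Gaussian falloff with bandwidth `sigma` is an illustrative
    choice, not taken from the source.
    """
    nearest = sorted((math.dist(point, n), i) for i, n in enumerate(nodes))[:k]
    raw = [(i, math.exp(-(d * d) / (2.0 * sigma * sigma))) for d, i in nearest]
    total = sum(w for _, w in raw)
    return [(i, w / total) for i, w in raw]


def deform_point(point, node_traj, t_src, t_dst, k=2, sigma=1.0):
    """Move `point` from frame t_src to frame t_dst.

    node_traj[t][i] is the 3D position of node i at frame t. The point is
    displaced by the weighted average of its neighbors' displacements
    (translation only; a hypothetical simplification).
    """
    src_nodes = node_traj[t_src]
    dst_nodes = node_traj[t_dst]
    moved = list(point)
    for i, w in blend_weights(point, src_nodes, k, sigma):
        for axis in range(3):
            moved[axis] += w * (dst_nodes[i][axis] - src_nodes[i][axis])
    return tuple(moved)


# Two nodes over two frames: node 0 is static, node 1 moves +1 along x.
node_traj = [
    [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]

# A point halfway between the nodes inherits half of node 1's motion.
midpoint = deform_point((1.0, 0.0, 0.0), node_traj, t_src=0, t_dst=1)

# With k=1, a point sitting on node 1 follows that node exactly.
on_node = deform_point((2.0, 0.0, 0.0), node_traj, t_src=0, t_dst=1, k=1)
```

Because every point is moved by a smooth blend of a few node trajectories, the number of free motion parameters scales with the number of nodes rather than with the number of points, which is the sense in which such a representation is compact.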