9 Dec 2021 | Alex Yu*, Sara Fridovich-Keil*, Matthew Tancik, Qinhong Chen, Benjamin Recht, Angjoo Kanazawa
Plenoxels is a novel system for photorealistic view synthesis that represents scenes as sparse 3D grids with spherical harmonics. Unlike Neural Radiance Fields (NeRF), Plenoxels can be optimized from calibrated images using gradient methods and regularization, without any neural components. On standard benchmark tasks, Plenoxels achieves optimization two orders of magnitude faster than NeRF without sacrificing visual quality. The method uses a sparse voxel grid in which each voxel stores opacity and spherical harmonic coefficients; these are interpolated to model the full plenoptic function continuously. Optimization proceeds coarse-to-fine, pruning unnecessary voxels and applying a total variation regularizer to encourage smoothness. Plenoxels handles both bounded and unbounded scenes, including 360° scenes, and can be optimized on a single GPU in 11 minutes for bounded scenes and 27 minutes for unbounded scenes. The method is demonstrated to produce high-quality results with interactive rendering speeds.
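To make the representation concrete, the sketch below shows the core idea described in the abstract: a voxel grid storing one opacity value plus degree-2 spherical harmonic (SH) coefficients per color channel, queried at a continuous point by trilinear interpolation and evaluated in a viewing direction. This is a minimal NumPy illustration under assumed conventions (a toy dense grid rather than the paper's sparse, CUDA-accelerated structure; helper names like `trilerp` and `sh_basis` are hypothetical), not the authors' implementation.

```python
import numpy as np

SH_COEFFS = 9  # degree-2 real SH basis: 1 + 3 + 5 functions

def sh_basis(d):
    """Evaluate the 9 real SH basis functions at unit direction d."""
    x, y, z = d
    return np.array([
        0.28209479177387814,                    # l = 0
        -0.4886025119029199 * y,                # l = 1
        0.4886025119029199 * z,
        -0.4886025119029199 * x,
        1.0925484305920792 * x * y,             # l = 2
        -1.0925484305920792 * y * z,
        0.31539156525252005 * (3 * z * z - 1),
        -1.0925484305920792 * x * z,
        0.5462742152960396 * (x * x - y * y),
    ])

def trilerp(grid, p):
    """Trilinearly interpolate per-voxel values at continuous point p."""
    p0 = np.floor(p).astype(int)
    t = p - p0
    out = 0.0
    for dx in (0, 1):
        for dy in (0, 1):
            for dz in (0, 1):
                w = ((t[0] if dx else 1 - t[0]) *
                     (t[1] if dy else 1 - t[1]) *
                     (t[2] if dz else 1 - t[2]))
                out = out + w * grid[p0[0] + dx, p0[1] + dy, p0[2] + dz]
    return out

# Toy grid: 4^3 voxels, each holding 1 opacity + 3 channels x 9 SH coefficients.
rng = np.random.default_rng(0)
opacity = rng.uniform(0.0, 1.0, (4, 4, 4))
sh = rng.normal(0.0, 0.1, (4, 4, 4, 3, SH_COEFFS))

p = np.array([1.3, 2.7, 0.5])   # continuous query point inside the grid
d = np.array([0.0, 0.0, 1.0])   # unit viewing direction

sigma = trilerp(opacity, p)     # interpolated opacity at p
coeffs = trilerp(sh, p)         # interpolated SH coefficients, shape (3, 9)
rgb = 1.0 / (1.0 + np.exp(-(coeffs @ sh_basis(d))))  # sigmoid -> color in (0, 1)
```

Because the color and opacity at any point are differentiable functions of the stored grid values, the coefficients can be fit directly by gradient descent on a rendering loss, which is what lets Plenoxels dispense with a neural network.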