14 Jun 2024 | Jiacong Xu, Yiqun Mei, Vishal M. Patel
**Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections**
**Authors:** Jiacong Xu, Yiqun Mei, Vishal M. Patel
**Abstract:**
Photographs captured in unstructured tourist environments often exhibit variable appearance and transient occlusions, which challenge accurate scene reconstruction and induce artifacts in novel view synthesis. While prior approaches have augmented Neural Radiance Fields (NeRF) with additional learnable modules to handle dynamic appearance and eliminate transient objects, their extensive training demands and slow rendering speeds limit practical deployment. Recently, 3D Gaussian Splatting (3DGS) has emerged as a promising alternative to NeRF, offering superior training and inference efficiency along with better rendering quality. This paper presents *Wild-GS*, an innovative adaptation of 3DGS optimized for unconstrained photo collections that preserves its efficiency benefits. *Wild-GS* determines the appearance of each 3D Gaussian from its inherent material attributes, the global illumination and camera properties of each image, and the point-level local variance of reflectance. Unlike previous methods that model reference features in image space, *Wild-GS* explicitly aligns pixel appearance features to the corresponding local Gaussians by sampling a triplane extracted from the reference image. This design effectively transfers the high-frequency appearance details of the reference view to 3D space and significantly expedites training. Additionally, 2D visibility maps and depth regularization are leveraged to mitigate transient effects and constrain the geometry, respectively. Extensive experiments demonstrate that *Wild-GS* achieves state-of-the-art rendering performance and the highest training and inference efficiency among existing techniques.
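To make the triplane-sampling idea concrete, here is a minimal PyTorch-style sketch: an encoder (not shown) predicts three axis-aligned feature planes from the reference image, and each Gaussian center is projected onto the planes to bilinearly sample its local appearance feature. All names (`sample_triplane_features`, `planes`, the `aabb` bounds) and shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sample_triplane_features(planes, xyz, aabb_min, aabb_max):
    """Sample per-Gaussian appearance features from a triplane (sketch).

    planes:   (3, C, H, W) feature planes (XY, XZ, YZ) predicted
              from the reference image by an encoder.
    xyz:      (N, 3) 3D Gaussian centers in world coordinates.
    aabb_*:   scene bounding box used to normalize coordinates.
    Returns:  (N, 3*C) concatenated local appearance features.
    """
    # Normalize Gaussian centers into [-1, 1] for grid_sample.
    u = 2.0 * (xyz - aabb_min) / (aabb_max - aabb_min) - 1.0   # (N, 3)

    # Project each center onto the three axis-aligned planes.
    coords = torch.stack(
        [u[:, [0, 1]], u[:, [0, 2]], u[:, [1, 2]]], dim=0
    )                                                          # (3, N, 2)

    # grid_sample expects (B, H_out, W_out, 2); use H_out=1, W_out=N.
    grid = coords.unsqueeze(1)                                 # (3, 1, N, 2)
    feats = F.grid_sample(planes, grid, mode='bilinear',
                          align_corners=True)                  # (3, C, 1, N)
    feats = feats.squeeze(2).permute(2, 0, 1)                  # (N, 3, C)
    return feats.reshape(feats.shape[0], -1)                   # (N, 3*C)
```

Sampling the reference features in 3D, rather than looking them up in image space, is what lets the local appearance term follow the Gaussians themselves and carry high-frequency detail into the scene representation.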
**Contributions:**
1. We propose a hierarchical appearance decomposition strategy to handle complex appearance variation across different views (a code sketch follows this list).
2. We design an explicit local appearance modeling method to capture high-frequency appearance details.
3. Our model achieves the best rendering quality and the highest efficiency in training and inference.
4. Our model supports high-quality appearance transfer from arbitrary reference images.
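The hierarchical decomposition in contribution 1 can be sketched as a small appearance head that fuses three levels of information per Gaussian: an intrinsic material feature, a global per-image embedding for illumination and camera effects, and the local triplane feature sampled above. Everything here (class name, feature dimensions, MLP width) is an assumption for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class HierarchicalAppearance(nn.Module):
    """Sketch of a hierarchical appearance head (assumed architecture)."""

    def __init__(self, d_intrinsic=16, d_global=32, d_local=48):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(d_intrinsic + d_global + d_local, 64),
            nn.ReLU(),
            nn.Linear(64, 3),   # RGB per Gaussian
            nn.Sigmoid(),
        )

    def forward(self, intrinsic, global_emb, local_feat):
        # intrinsic:  (N, d_intrinsic) learnable per-Gaussian features
        # global_emb: (d_global,) embedding of the reference image
        # local_feat: (N, d_local) triplane features (see sketch above)
        g = global_emb.expand(intrinsic.shape[0], -1)
        return self.mlp(torch.cat([intrinsic, g, local_feat], dim=-1))
```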
**Related Work:**
- **3D Scene Representation:** Neural Radiance Field (NeRF) and its extensions have achieved groundbreaking synthesis quality on complex scenes but assume static geometry, material, and lighting conditions.
- **3D Gaussian Splatting:** 3DGS represents a scene with millions of controllable 3D Gaussians, achieving high-quality, real-time rendering with competitive training efficiency; its per-pixel compositing rule is recalled below.
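For context, the 3DGS rasterizer composites the depth-sorted Gaussians overlapping each pixel by standard front-to-back alpha blending:

$$C = \sum_{i=1}^{N} c_i\,\alpha_i \prod_{j=1}^{i-1} \left(1 - \alpha_j\right),$$

where $c_i$ is the color of the $i$-th Gaussian and $\alpha_i$ is its opacity modulated by the projected 2D Gaussian footprint at that pixel.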
**Experimental Results:**
- *Wild-GS* outperforms existing methods in rendering quality and efficiency, achieving roughly a 3 dB PSNR improvement and about 200× shorter training time than state-of-the-art models. It also demonstrates superior performance in appearance transfer and tuning tasks.
**Conclusion:**
*Wild-GS* is an innovative adaptation of 3DGS optimized for unconstrained photo collections, achieving state-of-the-art rendering performance and the highest efficiency in training and inference.