EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting
**Abstract:**
Surgical 3D reconstruction is a critical area in robotic surgery; recent methods use dynamic radiance fields to reconstruct deformable tissues from single-viewpoint videos, but they often suffer from time-consuming optimization or inferior quality. Inspired by 3D Gaussian Splatting (3D-GS), a recently popular 3D representation, we present EndoGS, which applies Gaussian Splatting to deformable endoscopic tissue reconstruction. Our approach incorporates deformation fields to handle dynamic scenes, depth-guided supervision with spatial-temporal weight masks to optimize 3D targets under tool occlusion from a single viewpoint, and surface-aligned regularization terms to capture better geometry. Experiments on da Vinci robotic surgery videos demonstrate that EndoGS achieves superior rendering quality.
**Keywords:**
Gaussian Splatting · Robotic Surgery · 3D Reconstruction
**Introduction:**
Reconstructing high-quality deformable tissues from endoscopic videos is a challenging task that enables downstream applications such as surgical AR, education, and robot learning. Earlier methods based on depth estimation struggle with non-rigid deformations and tool occlusion. Neural Radiance Fields (NeRFs) have shown great potential but are prone to failure in dynamic scenes. 3D Gaussian Splatting (3D-GS) offers a powerful representation for real-time novel view synthesis; our method, EndoGS, leverages it to reconstruct 3D surgical scenes from single-viewpoint videos, estimated depth maps, and labeled tool masks.
**Method:**
EndoGS uses a deformable variant of 3D-GS to reconstruct 3D surgical scenes. It represents the scene with static Gaussians augmented by time-dependent deformation parameters, employs depth-guided supervision with spatial-temporal weight masks, and includes surface-aligned regularization terms. The method is trained with labeled tool masks and estimated depth maps, and uses differentiable rasterization to obtain rendered images and depth maps.
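The mask-weighted, depth-guided supervision described above can be sketched as a simple training loss. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: tensor shapes, names, and the weighting scheme are hypothetical, and the actual spatial-temporal weight masks in EndoGS are richer than a binary tool mask.

```python
import torch

def masked_render_loss(pred_rgb, gt_rgb, pred_depth, gt_depth,
                       tissue_mask, depth_weight=0.1):
    """Hypothetical sketch of depth-guided, mask-weighted supervision.

    pred_rgb, gt_rgb:     (3, H, W) rendered / ground-truth images
    pred_depth, gt_depth: (1, H, W) rendered / estimated depth maps
    tissue_mask:          (1, H, W), 1 on tissue, 0 on tool pixels,
                          so tool-occluded regions contribute no gradient
    """
    m = tissue_mask.float()
    denom = m.sum().clamp(min=1.0)  # number of supervised pixels
    # L1 color loss averaged over unmasked pixels and channels
    color_loss = (m * (pred_rgb - gt_rgb).abs()).sum() / (denom * pred_rgb.shape[0])
    # L1 depth loss against the estimated depth map
    depth_loss = (m * (pred_depth - gt_depth).abs()).sum() / denom
    return color_loss + depth_weight * depth_loss
```

Because the mask multiplies the residuals before summation, errors at tool pixels are ignored entirely rather than merely down-weighted.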
**Experiments:**
We evaluate EndoGS on da Vinci robotic surgery videos against competitive baselines. Results show that EndoGS outperforms other methods in both rendering quality and speed. Ablation studies demonstrate the effectiveness of the spatial total variation (TV) loss and the surface-aligned regularization.
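For reference, the spatial total variation term ablated above is a standard smoothness penalty; a minimal sketch, assuming it is applied to a (1, H, W) rendered depth map (the exact target tensor and weighting in EndoGS are not specified here):

```python
import torch

def spatial_tv_loss(depth):
    """Anisotropic total variation on a (1, H, W) depth map.

    Penalizes absolute differences between vertically and horizontally
    adjacent pixels, encouraging locally smooth geometry.
    """
    dh = (depth[:, 1:, :] - depth[:, :-1, :]).abs().mean()  # vertical neighbors
    dw = (depth[:, :, 1:] - depth[:, :, :-1]).abs().mean()  # horizontal neighbors
    return dh + dw
```

A constant depth map incurs zero penalty, while high-frequency noise is penalized in proportion to its local variation.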
**Conclusion:**
EndoGS achieves high-quality, real-time reconstruction of deformable endoscopic tissues from single-viewpoint videos, estimated depth maps, and labeled tool masks. However, limitations of 3D reconstruction from single-viewpoint videos remain; future work should explore more practical endoscopic reconstruction settings and additional surgical camera viewpoints to facilitate realistic downstream tasks.