DGE: Direct Gaussian 3D Editing by Consistent Multi-view Editing


22 Jul 2024 | Minghao Chen, Iro Laina, and Andrea Vedaldi
This paper introduces DGE, a method for directly editing 3D objects and scenes from open-ended language instructions. Unlike previous approaches that iteratively update a 3D representation such as a neural radiance field, which makes editing slow and prone to view inconsistencies, DGE directly optimizes a 3D Gaussian Splatting (GS) representation. The key idea is to first ensure multi-view consistency in image space, which then allows efficient and accurate 3D editing.

DGE consists of two main stages: (1) multi-view consistent editing with epipolar constraints, and (2) direct 3D reconstruction from the edited images. In the first stage, multiple views of the 3D scene are rendered and edited jointly; spatio-temporal attention combined with epipolar constraints aligns features across views so that the edits remain consistent. In the second stage, the edited images are used to directly optimize the Gaussian Splatting representation, which is significantly faster and more efficient than iterative update schemes.
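To make the first stage concrete, below is a minimal, illustrative sketch of epipolar-constrained cross-view attention in PyTorch. It is not the authors' implementation: the function names (`fundamental_matrix`, `epipolar_attention`), the pixel-distance threshold, and the assumption of known pinhole intrinsics and relative pose are choices made here purely for illustration. The idea shown is that a query pixel in one rendered view may only attend to key pixels in another view that lie close to its epipolar line, which is the geometric constraint that keeps edits aligned across views.

```python
import torch
import torch.nn.functional as F

def skew(t):
    """3x3 skew-symmetric matrix such that skew(t) @ x == cross(t, x)."""
    z = torch.zeros((), dtype=t.dtype)
    return torch.stack([
        torch.stack([z, -t[2], t[1]]),
        torch.stack([t[2], z, -t[0]]),
        torch.stack([-t[1], t[0], z]),
    ])

def fundamental_matrix(K1, K2, R, t):
    """Fundamental matrix mapping pixels of view 1 to epipolar lines in view 2.

    R, t: relative pose of view 2 w.r.t. view 1 (x2 = R @ x1 + t).
    """
    E = skew(t) @ R                                   # essential matrix
    return torch.linalg.inv(K2).T @ E @ torch.linalg.inv(K1)

def epipolar_attention(q_feat, k_feat, v_feat, F_mat, hw, tol=2.0):
    """Cross-view attention where a query pixel only attends to key pixels
    within `tol` pixels of its epipolar line in the other view.

    q_feat, k_feat, v_feat: (N, C) per-pixel features, N = H * W.
    """
    H, W = hw
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=F_mat.dtype),
        torch.arange(W, dtype=F_mat.dtype), indexing="ij")
    pix = torch.stack([xs.reshape(-1), ys.reshape(-1),
                       torch.ones(H * W, dtype=F_mat.dtype)])      # (3, N)
    lines = F_mat @ pix                               # epipolar line of each query, (3, N)
    num = (pix.T @ lines).abs()                       # |a*x + b*y + c| for every key/query pair
    den = lines[:2].pow(2).sum(0).sqrt().clamp(min=1e-8)
    mask = (num / den <= tol).T                       # (N_queries, N_keys)

    attn = q_feat @ k_feat.T / q_feat.shape[-1] ** 0.5
    attn = attn.masked_fill(~mask, float("-inf"))
    attn = torch.nan_to_num(F.softmax(attn, dim=-1))  # zero out rows with no valid key
    return attn @ v_feat
```

In the method itself, this kind of geometric masking is applied inside the attention layers of the image editing model across the rendered views; the sketch only shows the epipolar masking idea on generic per-pixel features.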
The method is evaluated on several benchmark datasets and compared with existing approaches such as GaussianEditor and Instruct-NeRF2NeRF. Results show that DGE achieves higher fidelity and efficiency, with a significant reduction in editing time. DGE also supports selective editing of specific parts of a scene, making it more flexible and user-friendly.

The paper further discusses the advantages of 3D Gaussian Splatting over other 3D representations, including its efficiency in rendering and gradient computation and its ability to support local edits. The approach is particularly effective where multi-view consistency is crucial, such as video editing and 3D scene reconstruction. Overall, DGE offers a more efficient and accurate approach to 3D editing, handling complex scenes and producing high-quality results in a fraction of the time required by previous methods.
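The efficiency of rendering and gradient computation is what makes the second, direct-reconstruction stage practical: the edited views can simply be fitted by gradient descent on the Gaussian parameters. The following is a minimal sketch of that loop, not the paper's code; `render_gaussians` stands in for a differentiable Gaussian Splatting rasteriser (such as the one released with 3DGS) and is an assumed callable, and the loss uses only an L1 photometric term for brevity.

```python
import torch

def fit_to_edited_views(params, cameras, edited_images, render_gaussians,
                        steps=1000, lr=1e-2):
    """Directly optimise Gaussian parameters so renders match the edited views.

    params:         dict of tensors with requires_grad=True
                    (e.g. means, scales, rotations, opacities, colours).
    cameras:        camera descriptions accepted by `render_gaussians` (assumed).
    edited_images:  list of (3, H, W) tensors, the consistently edited renders.
    render_gaussians(params, camera) -> (3, H, W): assumed differentiable rasteriser.
    """
    opt = torch.optim.Adam(list(params.values()), lr=lr)
    for step in range(steps):
        i = step % len(cameras)                        # cycle over the edited views
        pred = render_gaussians(params, cameras[i])
        loss = (pred - edited_images[i]).abs().mean()  # L1 photometric loss only
        opt.zero_grad()
        loss.backward()
        opt.step()
    return params
```

For selective edits, one plausible variant is to restrict the update to Gaussians whose projections fall inside a 2D edit mask (for example by zeroing the gradients of all other Gaussians); this is one way the local-editing ability mentioned above could be realised, though the paper's exact mechanism may differ.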