17 Feb 2025 | Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, and Hanwang Zhang
The paper introduces View-Consistent Editing (VcEDIT), a framework for 3D Gaussian Splatting (3DGS) editing that addresses multi-view inconsistency in image-guided 3D editing. VcEDIT integrates 3DGS into the image editing process so that the guidance images remain consistent across views, which mitigates mode collapse and improves editing quality.
Key contributions of VcEDIT include:
1. **Consistency Modules**: Two innovative modules, the Cross-attention Consistency Module (CCM) and the Editing Consistency Module (ECM), are designed to reduce multi-view inconsistencies.
   - **CCM**: Consolidates multi-view cross-attention maps to harmonize the model's attention across views.
   - **ECM**: Calibrates multi-view inconsistent editing outputs by fine-tuning a source-cloned 3DGS model and rendering it back to images.
2. **Iterative Pattern**: An iterative pattern is introduced to continuously refine the 3DGS and image guidance, ensuring consistent and high-quality edits.
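
The idea shared by both modules, passing per-view signals through one 3D representation and rendering them back, can be illustrated with a toy sketch. This is not the paper's implementation: the function names, the flat per-pixel lists, the `pixel_to_point` lookup (standing in for Gaussian splatting), and the simple averaging rule are all assumptions made for illustration.

```python
from typing import List, Tuple


def consolidate_views(
    view_maps: List[List[float]],
    pixel_to_point: List[List[int]],
    num_points: int,
) -> Tuple[List[float], List[List[float]]]:
    """Average per-view values onto shared 3D points, then project back.

    Hypothetical stand-in for consolidating multi-view maps: each pixel
    is assumed to observe at most one 3D point (-1 means "no point").
    """
    sums = [0.0] * num_points
    counts = [0] * num_points
    # "Inverse render": accumulate each pixel value onto the point it sees.
    for values, mapping in zip(view_maps, pixel_to_point):
        for value, point in zip(values, mapping):
            if point >= 0:
                sums[point] += value
                counts[point] += 1
    point_values = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    # "Render": each pixel reads back the consolidated value of its point;
    # pixels that see no point keep their original value.
    consolidated = [
        [point_values[p] if p >= 0 else v for v, p in zip(values, mapping)]
        for values, mapping in zip(view_maps, pixel_to_point)
    ]
    return point_values, consolidated


def max_disagreement(
    view_maps: List[List[float]],
    pixel_to_point: List[List[int]],
    num_points: int,
) -> float:
    """Largest spread of values that different views assign to one point."""
    lo = [float("inf")] * num_points
    hi = [float("-inf")] * num_points
    for values, mapping in zip(view_maps, pixel_to_point):
        for value, point in zip(values, mapping):
            if point >= 0:
                lo[point] = min(lo[point], value)
                hi[point] = max(hi[point], value)
    # Points never observed keep lo > hi and are skipped.
    return max((h - l for h, l in zip(hi, lo) if h >= l), default=0.0)


# Two views of three points; the second view sees points 0 and 1 in
# swapped pixel order and has one pixel that observes no point.
views = [[1.0, 0.0, 0.5], [0.5, 0.5, 0.25]]
mapping = [[0, 1, 2], [1, 0, -1]]

before = max_disagreement(views, mapping, 3)
point_values, consolidated = consolidate_views(views, mapping, 3)
after = max_disagreement(consolidated, mapping, 3)
```

In this toy setup the views disagree by up to 0.5 on a shared point before consolidation and by 0.0 afterwards. An iterative scheme in the spirit of item 2 would alternate a per-view update with this kind of consolidation step, so that inconsistencies introduced in one round are removed before the next.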
Experiments demonstrate that VcEDIT outperforms existing methods in various real-world scenes, including face, object, and large-scale scene editing. The framework effectively handles intricate transformations and large-scale adjustments, providing superior editing results with minimal artifacts.
The paper also includes ablation studies to validate the effectiveness of the consistency modules and an analysis of the iterative pattern's impact on editing quality. Overall, VcEDIT sets a new standard for text-driven 3D model editing, offering a robust solution to multi-view inconsistency in 3DGS editing.