ManiGaussian is a novel method for multi-task robotic manipulation that leverages dynamic Gaussian Splatting to capture scene-level spatiotemporal dynamics. The method aims to improve robot performance in unstructured environments by learning how semantics propagate in the Gaussian embedding space. Specifically, it formulates a dynamic Gaussian Splatting framework that infers semantic features in the Gaussian embedding space, which are then used to predict optimal robot actions. A Gaussian world model parameterizes the distributions in this framework and provides informative supervision by reconstructing future scenes from current observations and robot actions. Evaluated on 10 RLBench tasks with 166 variations, the method achieves an average success rate 13.1% higher than state-of-the-art methods. The contributions of ManiGaussian include:
1. **Dynamic Gaussian Splatting Framework**: Models the propagation of semantic features in the Gaussian embedding space to capture scene-level spatiotemporal dynamics (see the sketch after this list).
2. **Gaussian World Model**: Parameterizes the distributions in the dynamic Gaussian Splatting framework, providing informative supervision through future scene reconstruction (see the sketch at the end of this section).
3. **Performance Evaluation**: Outperforms state-of-the-art methods by 13.1% in average success rate on 10 RLBench tasks.
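To make the first contribution more concrete, below is a minimal sketch (not the authors' implementation) of Gaussians augmented with a per-primitive semantic embedding, plus a small network that propagates centers and semantic features one step forward conditioned on the robot action. All class names, dimensions, and the choice of a plain MLP are illustrative assumptions; the paper's actual deformation model may differ.

```python
import torch
import torch.nn as nn


class SemanticGaussians(nn.Module):
    """A set of 3D Gaussians, each carrying an extra semantic embedding (illustrative)."""

    def __init__(self, num_gaussians: int, feat_dim: int = 32):
        super().__init__()
        self.xyz = nn.Parameter(torch.randn(num_gaussians, 3) * 0.1)        # centers
        self.log_scale = nn.Parameter(torch.zeros(num_gaussians, 3))        # per-axis scales (log)
        self.rotation = nn.Parameter(torch.randn(num_gaussians, 4))         # unnormalized quaternions
        self.opacity = nn.Parameter(torch.zeros(num_gaussians, 1))          # pre-sigmoid opacity
        self.color = nn.Parameter(torch.zeros(num_gaussians, 3))            # RGB
        self.semantic = nn.Parameter(torch.zeros(num_gaussians, feat_dim))  # semantic feature


class GaussianPropagator(nn.Module):
    """Predicts per-Gaussian deltas for the next timestep, conditioned on the robot action."""

    def __init__(self, feat_dim: int = 32, action_dim: int = 8, hidden: int = 128):
        super().__init__()
        in_dim = 3 + feat_dim + action_dim  # center + semantic feature + action
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 3 + feat_dim),  # delta center + delta semantic feature
        )

    def forward(self, g: SemanticGaussians, action: torch.Tensor):
        n = g.xyz.shape[0]
        a = action.expand(n, -1)                        # broadcast the action to every Gaussian
        delta = self.net(torch.cat([g.xyz, g.semantic, a], dim=-1))
        next_xyz = g.xyz + delta[:, :3]                 # propagated centers
        next_semantic = g.semantic + delta[:, 3:]       # propagated semantic features
        return next_xyz, next_semantic


# Example usage with hypothetical sizes:
# gaussians = SemanticGaussians(num_gaussians=4096)
# propagator = GaussianPropagator()
# next_xyz, next_semantic = propagator(gaussians, torch.zeros(8))
```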
The method addresses the limitations of conventional methods by explicitly encoding scene dynamics, leading to more accurate action predictions and higher task success rates.
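For the Gaussian world model's supervision signal, a hedged sketch of the idea follows: propagate the current Gaussians with the chosen action, render the predicted future view, and penalize the difference against the observed future frame. Here `render_fn` stands in for whatever differentiable Gaussian-splatting rasterizer is used; its signature, and the use of a plain L1 photometric loss, are assumptions for illustration rather than the paper's exact objective.

```python
import torch.nn.functional as F


def future_reconstruction_loss(gaussians, action, future_rgb, propagator, render_fn, camera):
    """One-step future-prediction supervision for the Gaussian world model (illustrative).

    gaussians : current SemanticGaussians (see the sketch above)
    action    : robot action tensor, shape (action_dim,)
    future_rgb: observed next frame, shape (3, H, W), values in [0, 1]
    render_fn : assumed differentiable renderer mapping
                (centers, gaussians, camera) -> rendered image of shape (3, H, W)
    camera    : camera parameters expected by render_fn
    """
    next_xyz, next_semantic = propagator(gaussians, action)  # predicted future Gaussians
    pred_rgb = render_fn(next_xyz, gaussians, camera)         # render the predicted future view
    # Photometric reconstruction loss against the actually observed future frame;
    # feature-level terms on `next_semantic` could be added in the same way.
    return F.l1_loss(pred_rgb, future_rgb)
```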