RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation

The paper introduces RoboEXP, a robotic exploration framework that enables a robot to autonomously explore its environment and construct an Action-Conditioned Scene Graph (ACSG). The ACSG captures low-level geometric and semantic information together with high-level action-conditioned relationships between entities. RoboEXP combines a Large Multimodal Model (LMM) with an explicit memory design. The system reasons about what to explore and how to explore it, accumulating new information through interaction and incrementally building the ACSG. Its effectiveness and efficiency are demonstrated on real-world manipulation tasks involving rigid, articulated, nested, and deformable objects. An evaluation of its ability to handle diverse exploration scenarios and build complete ACSGs shows better performance than a strong baseline. The reconstructed ACSG is crucial for guiding complex downstream manipulation tasks, such as setting a table in a household environment. The paper also discusses limitations and future work, highlighting the need for more robust perception and stronger LMM capabilities.
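To make the idea of an action-conditioned scene graph more concrete, here is a minimal Python sketch. It is not the paper's implementation: the class and method names (`SceneGraph`, `add_object`, `add_action`, `actions_to_reach`), the relation labels, and the cabinet/drawer/fork scene are all assumptions chosen for illustration. It shows only the general idea that object nodes and action nodes share one graph, so that objects revealed by an interaction are attached to the action that revealed them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str  # "object" or "action" (hypothetical labels)


@dataclass
class SceneGraph:
    """Toy action-conditioned scene graph; assumes a tree-shaped scene."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_object(self, name, parent=None, relation="on"):
        # Register an object that is directly observable.
        self.nodes[name] = Node(name, "object")
        if parent is not None:
            self.edges.append((parent, relation, name))

    def add_action(self, action, target, reveals=()):
        # Attach an action node to its target object, then hang every
        # object discovered by that action underneath the action node.
        action_id = f"{action}({target})"
        self.nodes[action_id] = Node(action_id, "action")
        self.edges.append((target, "affords", action_id))
        for obj in reveals:
            self.nodes[obj] = Node(obj, "object")
            self.edges.append((action_id, "reveals", obj))
        return action_id

    def actions_to_reach(self, obj):
        # Walk parent edges back to the root, collecting the actions
        # that must be executed before `obj` becomes accessible.
        plan, current, seen = [], obj, set()
        while current not in seen:
            seen.add(current)
            parents = [s for s, _, t in self.edges if t == current]
            if not parents:
                break
            current = parents[0]
            if self.nodes[current].kind == "action":
                plan.append(current)
        return list(reversed(plan))


# Hypothetical scene, built incrementally as exploration reveals objects.
graph = SceneGraph()
graph.add_object("table")
graph.add_object("cabinet", parent="table")
graph.add_action("open", "cabinet", reveals=["drawer"])
graph.add_action("open", "drawer", reveals=["fork"])

print(graph.actions_to_reach("fork"))  # ['open(cabinet)', 'open(drawer)']
```

In this toy version, a downstream task such as fetching the fork reduces to reading off the chain of actions between the root and the target node. A real system would additionally need perception, geometry, and a policy for deciding what to explore, none of which is modeled here.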

RoboEXP: Action-Conditioned Scene Graph via Interactive Exploration for Robotic Manipulation

2024 | Hanxiao Jiang, Binghao Huang, Ruihai Wu, Zhuoran Li, Shubham Garg, Hooshang Nayyeri, Shenlong Wang, Yunzhu Li