RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation

9 May 2024 | Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian Reid
The paper "RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation" introduces a novel topological map representation for environments using image segments. This approach leverages recent advancements in image segmentation (e.g., SAM) and vision-language coupling (e.g., CLIP) to create a graph where segments serve as nodes and edges form connections based on segment-level descriptors and pixel centroids. The method enables semantic and open-vocabulary queryable mapping, providing a continuous sense of place through inter-image persistence and intra-image connectivity. The paper demonstrates how this representation can generate navigation plans in the form of "hops" over segments and search for target objects using natural language queries. Experiments on real-world data show the effectiveness of the proposed method in segment-level data association, localization, planning, and navigation, with preliminary trials indicating zero-shot real-world navigation capabilities. The contributions include a novel topological representation, a mechanism for intra- and inter-image connectivity, and methods for generating semantically interpretable segment-level plans. The paper also discusses limitations and future work, highlighting the potential for incorporating visual servoing and integrating metric information to enhance navigation capabilities.The paper "RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation" introduces a novel topological map representation for environments using image segments. This approach leverages recent advancements in image segmentation (e.g., SAM) and vision-language coupling (e.g., CLIP) to create a graph where segments serve as nodes and edges form connections based on segment-level descriptors and pixel centroids. The method enables semantic and open-vocabulary queryable mapping, providing a continuous sense of place through inter-image persistence and intra-image connectivity. 
The paper demonstrates how this representation can generate navigation plans in the form of "hops" over segments and search for target objects using natural language queries. Experiments on real-world data show the effectiveness of the proposed method in segment-level data association, localization, planning, and navigation, with preliminary trials indicating zero-shot real-world navigation capabilities. The contributions include a novel topological representation, a mechanism for intra- and inter-image connectivity, and methods for generating semantically interpretable segment-level plans. The paper also discusses limitations and future work, highlighting the potential for incorporating visual servoing and integrating metric information to enhance navigation capabilities.
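The segment-graph idea summarized above can be sketched in miniature. This is an illustrative toy, not the paper's implementation: the `Segment` fields stand in for SAM masks and CLIP descriptors (not reproduced here), and the edge rules and thresholds (`pixel_radius`, `sim_threshold`) are simplified assumptions about how centroid proximity and descriptor similarity might define intra- and inter-image connectivity.

```python
import math
from collections import deque
from dataclasses import dataclass, field

@dataclass
class Segment:
    """One map node: an image segment with a pixel centroid and a descriptor."""
    node_id: int
    image_id: int
    centroid: tuple      # (x, y) pixel centroid, stand-in for a SAM mask centroid
    descriptor: tuple    # embedding vector, stand-in for a CLIP-style feature
    neighbors: set = field(default_factory=set)

def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def build_graph(segments, pixel_radius=120.0, sim_threshold=0.9):
    """Toy edge rules: intra-image edges link segments whose centroids are close;
    inter-image edges link segments whose descriptors are highly similar."""
    for i, s in enumerate(segments):
        for t in segments[i + 1:]:
            if s.image_id == t.image_id:
                if math.dist(s.centroid, t.centroid) <= pixel_radius:
                    s.neighbors.add(t.node_id)
                    t.neighbors.add(s.node_id)
            elif cosine(s.descriptor, t.descriptor) >= sim_threshold:
                s.neighbors.add(t.node_id)
                t.neighbors.add(s.node_id)
    return {s.node_id: s for s in segments}

def plan_hops(graph, start_id, goal_id):
    """Breadth-first search over the graph: the plan is a sequence of segment hops."""
    queue, seen = deque([[start_id]]), {start_id}
    while queue:
        path = queue.popleft()
        if path[-1] == goal_id:
            return path
        for nxt in graph[path[-1]].neighbors:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # goal segment unreachable in this graph

# Toy map: two images, two segments each. Segments 1 and 2 look alike
# across images, so they get an inter-image edge; the rest are intra-image.
segs = [
    Segment(0, image_id=0, centroid=(0, 0),  descriptor=(1.0, 0.0)),
    Segment(1, image_id=0, centroid=(50, 0), descriptor=(0.0, 1.0)),
    Segment(2, image_id=1, centroid=(0, 0),  descriptor=(0.0, 1.0)),
    Segment(3, image_id=1, centroid=(60, 0), descriptor=(1.0, 1.0)),
]
graph = build_graph(segs)
plan = plan_hops(graph, 0, 3)  # hop within image 0, across to image 1, then within it
```

In the paper's setting the descriptors would come from a vision-language model, which is what makes the same graph searchable with natural-language object queries; here the plan is simply the shortest hop sequence from the start segment to the goal segment.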