7 Jun 2024 | Yuxing Long, Wenzhe Cai, Hongcheng Wang, Guanqi Zhan and Hao Dong
InstructNav is a zero-shot system for generic instruction navigation in unexplored environments. It enables robots to follow diverse language instructions without any navigation training or pre-built maps. The system introduces Dynamic Chain-of-Navigation (DCoN) to unify planning across different types of navigation instructions, and Multi-sourced Value Maps to model the key elements of instruction navigation, converting linguistic DCoN plans into trajectories the robot can execute. Large language models and multimodal models generate the navigation plans, and the system handles several instruction types, including object goal navigation, visual language navigation, and demand-driven navigation.
InstructNav completes the R2R-CE task in a zero-shot setting and outperforms many methods trained for that task. It also surpasses previous state-of-the-art methods by 10.48% on Habitat ObjNav and by 86.34% on demand-driven navigation (DDN). Real-robot experiments in diverse indoor scenes demonstrate its robustness to variations in environments and instructions. InstructNav is the first generic instruction navigation system that can execute different types of instructions in a continuous environment without any navigation training or pre-built maps.
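The summary does not say how the Multi-sourced Value Maps are combined or how a trajectory is read off them. As a rough illustration only, the sketch below assumes each source produces a 2D grid of scores over the same map cells, that the grids are min-max normalized and summed with per-source weights, and that the next waypoint is the highest-scoring cell. The source names (`semantic`, `history`), the weights, and all function names are hypothetical and are not taken from InstructNav.

```python
from typing import Dict, List, Tuple

Grid = List[List[float]]


def normalize(grid: Grid) -> Grid:
    """Rescale a grid to [0, 1]; a constant grid maps to all zeros."""
    flat = [v for row in grid for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in grid]
    return [[(v - lo) / (hi - lo) for v in row] for row in grid]


def fuse_value_maps(maps: Dict[str, Grid], weights: Dict[str, float]) -> Grid:
    """Weighted sum of normalized value maps (assumed fusion rule)."""
    first = next(iter(maps.values()))
    rows, cols = len(first), len(first[0])
    fused = [[0.0] * cols for _ in range(rows)]
    for name, grid in maps.items():
        norm = normalize(grid)
        w = weights.get(name, 1.0)  # sources without a weight count fully
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += w * norm[r][c]
    return fused


def best_waypoint(fused: Grid) -> Tuple[int, int]:
    """Return the (row, col) of the highest-valued cell; first one wins ties."""
    cells = ((v, (r, c)) for r, row in enumerate(fused) for c, v in enumerate(row))
    return max(cells, key=lambda item: item[0])[1]


# Hypothetical sources: how well each cell matches the instruction, and a
# history map that scores already-visited cells lower.
value_maps = {
    "semantic": [[0.0, 1.0, 2.0], [0.0, 0.0, 4.0]],
    "history": [[1.0, 1.0, 1.0], [1.0, 1.0, 0.0]],
}
fused = fuse_value_maps(value_maps, {"semantic": 1.0, "history": 1.0})
waypoint = best_waypoint(fused)
semantic_only = best_waypoint(fuse_value_maps({"semantic": value_maps["semantic"]}, {}))
```

In this toy setup the semantic map alone would send the robot to cell `(1, 2)`, but the history map marks that cell as already visited, so the fused map selects `(0, 2)` instead. This shows how several sources can jointly shape the chosen goal; a real system would then plan a collision-free path to that waypoint.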