22 Jun 2024 | Vanya Cohen, Jason Xinyu Liu, Raymond Mooney, Stefanie Tellex, David Watkins
This survey reviews the literature on robotic language grounding, focusing on the trade-offs between formal representations and high-dimensional vector spaces. It places approaches on a spectrum between two poles: mapping language to manually defined formal representations, and mapping language to high-dimensional vector spaces that translate directly to low-level robot policies. Formal representations, such as temporal logic, the Planning Domain Definition Language (PDDL), and code, offer a precise representation of meaning, interpretability, and formal safety guarantees, but limit the size of the learning problem. High-dimensional vector spaces require more data and computational resources and can be more flexible and generalizable, but they lack the structure and interpretability of formal methods. The survey discusses the benefits and limitations of each approach and highlights the need for future work that combines the strengths of both. It also addresses open problems such as the choice of formal language, the integration of multimodal inputs, and the generalization and safety of deep learning models in robotic language grounding.