[slides and audio] Real-Time Anomaly Detection and Reactive Planning with Large Language Models

This paper presents a two-stage reasoning framework for real-time anomaly detection and reactive planning using large language models (LLMs). The framework consists of a fast binary anomaly classifier that analyzes observations in an LLM embedding space, which may trigger a slower fallback selection stage that uses the reasoning capabilities of generative LLMs. These stages correspond to branch points in a model predictive control strategy that, as soon as an anomaly is detected, maintains the joint feasibility of continuing along various fallback plans to account for the slow reasoner's latency, thus ensuring safety. The fast anomaly classifier outperforms autoregressive reasoning with state-of-the-art GPT models, even when instantiated with relatively small language models. This enables our runtime monitor to improve the trustworthiness of dynamic robotic systems, such as quadrotors or autonomous vehicles, under resource and time constraints. The framework is evaluated with several common LLMs, ranging from 10^8 to 10^12 parameters, and compared against conventional out-of-distribution (OOD) detection techniques on an extensive suite of synthetic text-based domains, simulated and real-world closed-loop quadrotor experiments resembling a drone delivery service, and careful recreations of recent real-world failure modes of autonomous vehicles in the CARLA simulator.

The framework is designed to detect anomalies and to reason about the appropriate safety-preserving course of action in response. It includes a fast reasoning method based on embeddings that surpasses generative chain-of-thought (CoT) reasoning with high-capacity LLMs such as GPT-4. It also includes a slow reasoning method based on autoregressive generation that allows the LLM-based monitor to reason methodically about the safety consequences of out-of-distribution scenarios and decide, in a zero-shot fashion, whether intervention is necessary.
Finally, it includes a hierarchical multi-contingency planning method that integrates both foundation-model-based reasoners into a lower-level reactive control loop by maintaining multiple feasible trajectories, each corresponding to a high-level intervention strategy. The framework is evaluated on synthetic domains, including a Warehouse Manipulator domain, an Autonomous Vehicle domain, and a Vertical Take-off and Landing (VTOL) Aircraft domain. The results show that the fast anomaly detector compares favorably with generative reasoning-based approaches, and that the full approach can be integrated into a broader robotics stack for real-time control of an agile system. The framework is also evaluated on real-world hardware, including a quadrotor equipped with a downward-facing camera, and the results show that the framework allows for dynamic control of the robot while leveraging the slower LLM to improve safety in a reactive, real-time manner. The results validate the framework's ability to detect and respond to anomalies in real time, ensuring the safety of dynamic robotic systems.
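
To make the two-stage idea concrete, the following is a minimal sketch of a fast embedding-space check that hands off to a slower reasoner only when an observation looks unfamiliar. It is an illustration under stated assumptions, not the authors' implementation: the scoring rule (one minus the maximum cosine similarity to a bank of nominal embeddings), the leave-one-out quantile threshold, the toy 3-dimensional vectors standing in for LLM embeddings, and every name in the code (`anomaly_score`, `calibrate_threshold`, `monitor_step`, `stub_slow_reasoner`, `"land_at_recovery_site"`) are hypothetical choices made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def anomaly_score(query, nominal_bank):
    """Assumed scoring rule: higher means less similar to any nominal embedding."""
    return 1.0 - max(cosine(query, e) for e in nominal_bank)


def calibrate_threshold(nominal_bank, quantile=0.95):
    """Pick a threshold from leave-one-out scores of the nominal embeddings."""
    scores = sorted(
        anomaly_score(e, nominal_bank[:i] + nominal_bank[i + 1:])
        for i, e in enumerate(nominal_bank)
    )
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


def monitor_step(query, nominal_bank, threshold, slow_reasoner):
    """Stage 1: fast embedding check. Stage 2: slow reasoner only if flagged."""
    if anomaly_score(query, nominal_bank) <= threshold:
        return "continue_nominal"
    return slow_reasoner(query)


def stub_slow_reasoner(query):
    """Hypothetical stand-in for a generative LLM that selects a fallback plan."""
    return "land_at_recovery_site"


# Toy vectors standing in for embeddings of nominal observations (assumption).
nominal_bank = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.1],
    [1.0, 0.1, 0.1],
]
threshold = calibrate_threshold(nominal_bank)

familiar = [0.95, 0.05, 0.0]   # close to the nominal bank
unfamiliar = [0.0, 0.0, 1.0]   # nearly orthogonal to the nominal bank

familiar_action = monitor_step(familiar, nominal_bank, threshold, stub_slow_reasoner)
unfamiliar_action = monitor_step(unfamiliar, nominal_bank, threshold, stub_slow_reasoner)
```

In this sketch the slow reasoner is only called for the unfamiliar observation. In the framework described above, the controller would additionally keep every fallback plan feasible while waiting for that slower answer; that planning layer is not modeled here.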

11 Jul 2024 | Rohan Sinha, Amine Elhafsi, Christopher Agia, Matthew Foutter, Edward Schmerling, Marco Pavone