AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

26 Mar 2024 | Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker
The paper introduces AIDE (Automatic Data Engine for Object Detection in Autonomous Driving), a system designed to automate the data curation, model training, and verification processes for autonomous driving systems. AIDE leverages recent advancements in vision-language models (VLMs) and large language models (LLMs) to identify issues, curate data efficiently, improve the model through auto-labeling, and verify the model through diverse scenario generation. The system operates iteratively, allowing for continuous self-improvement of the model. The authors establish a benchmark for open-world detection on AV datasets to evaluate various learning paradigms, demonstrating superior performance at a reduced cost.

Key components of AIDE include the Issue Finder, Data Feeder, Model Updater, and Verification. The Issue Finder uses dense captioning models to identify missing categories, the Data Feeder employs VLMs to query relevant images, the Model Updater performs pseudo-labeling and continuous training, and the Verification step evaluates the model's robustness under diverse scenarios.

Experimental results show that AIDE outperforms existing methods in novel object detection, achieving a 2.3% AP improvement on novel categories and an 8.7% AP improvement on known categories compared to the state-of-the-art zero-shot object detection method, OWL-v2. The paper also includes ablation studies and analysis to validate the effectiveness of each component of AIDE.
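The iterative four-stage loop summarized above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: every name here (`ToyDetector`, `issue_finder`, `data_feeder`, `model_updater`, `verification`, `run_engine`) is hypothetical, and plain set operations stand in for the dense captioning model, the VLM image retrieval, the pseudo-labeling and training step, and the scenario-based verification.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDetector:
    """Hypothetical stand-in for a detector: it only tracks the labels it knows."""
    known: set = field(default_factory=set)


def issue_finder(detector, captions):
    """Return categories mentioned in the captions that the detector lacks.

    Each caption is assumed to be already parsed into a list of category words.
    """
    mentioned = {word for caption in captions for word in caption}
    return sorted(mentioned - detector.known)


def data_feeder(category, pool):
    """Return ids of pool images whose tags include the category (toy retrieval)."""
    return [img_id for img_id, tags in pool.items() if category in tags]


def model_updater(detector, category, image_ids):
    """Toy 'pseudo-label and train' step: learn the category if any images were found."""
    if image_ids:
        detector.known.add(category)


def verification(detector, category):
    """Toy check: does the updated detector now cover the category?"""
    return category in detector.known


def run_engine(detector, captions, pool, max_rounds=3):
    """Repeat find -> feed -> update -> verify until nothing is missing."""
    verified = {}
    for _ in range(max_rounds):
        missing = issue_finder(detector, captions)
        if not missing:
            break
        for category in missing:
            image_ids = data_feeder(category, pool)
            model_updater(detector, category, image_ids)
            verified[category] = verification(detector, category)
    return verified


# Example run with made-up captions and an unlabeled image pool.
captions = [["car", "stroller"], ["car", "cone"]]
pool = {"img1": {"stroller"}, "img2": {"cone", "car"}, "img3": {"car"}}
detector = ToyDetector(known={"car"})
print(run_engine(detector, captions, pool))
```

The point of the sketch is only the control flow: each round starts from what the current model is missing, so the loop stops on its own once no new categories are found or the round budget runs out.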