26 Mar 2024 | Cong Ma¹, Lei Qiao¹, Chengkai Zhu¹, Kai Liu¹, Zelong Kong¹, Qing Li¹, Xueqi Zhou¹, Yuheng Kan¹, Wei Wu¹,²*
HoloVIC is a large-scale multi-sensor dataset and benchmark for holographic intersection and vehicle-infrastructure cooperative perception. It covers three sensor types (camera, fisheye camera, and lidar) across four sensor layouts, with each intersection equipped with 6-18 sensors capturing synchronized data. HoloVIC contains over 100,000 synchronized frames with 3D bounding boxes annotated from the camera, fisheye, and lidar data, and it associates the IDs of the same objects across different devices and across consecutive frames. Based on HoloVIC, four tasks are formulated to facilitate research: Monocular 3D Detection (Mono3D), Lidar 3D Detection, Multiple Object Tracking (MOT), and Multi-sensor Multi-object Tracking (MSMOT). In addition, Vehicle-Infrastructure Cooperation Perception (VIC Perception) is introduced to evaluate the benefit of roadside perception for vehicles. The dataset is divided into training, testing, and validation sets in a 50%/40%/10% split. Benchmarking of various perception and tracking algorithms on HoloVIC shows that VIC perception outperforms vehicle-side perception on most metrics. The dataset also provides detailed sensor specifications and benchmark results for the different sensor layouts and tasks, making HoloVIC a comprehensive resource for research in autonomous driving and vehicle-infrastructure cooperation.
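The 50%/40%/10% partition into training, testing, and validation sets can be sketched as follows. This is an illustrative helper only, assuming integer frame IDs and a random shuffle; the official split is defined by the HoloVIC release, not by this code.

```python
import random

def split_frames(frame_ids, seed=0):
    """Partition frame IDs into train/test/val using the 50/40/10
    ratio described for HoloVIC. Hypothetical helper: the actual
    dataset ships its own predefined split files."""
    ids = list(frame_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for reproducibility
    n = len(ids)
    n_train = int(n * 0.5)
    n_test = int(n * 0.4)
    return {
        "train": ids[:n_train],
        "test": ids[n_train:n_train + n_test],
        "val": ids[n_train + n_test:],  # remaining ~10%
    }

splits = split_frames(range(100000))
print(len(splits["train"]), len(splits["test"]), len(splits["val"]))
# → 50000 40000 10000
```

A fixed seed keeps the partition reproducible across runs, which matters when comparing detection and tracking results on the same held-out frames.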