2 May 2024 | Youquan Liu*, Lingdong Kong*, Xiaoyang Wu, Runnan Chen, Xin Li, Liang Pan, Ziwei Liu, Yuexin Ma
M3Net is a framework for universal LiDAR segmentation that supports multi-task, multi-dataset, and multi-modality training with a single set of parameters. To cope with heterogeneous LiDAR data, the model performs alignments in the data, feature, and label spaces during training, allowing it to handle datasets with diverse sensor configurations, data patterns, and label taxonomies. It further combines multi-sensor inputs, including images and text, to strengthen feature alignment and keep label spaces consistent, and it supports multi-task learning by integrating knowledge from these different sensor sources. Extensive experiments on twelve LiDAR segmentation datasets show that M3Net achieves state-of-the-art performance, reaching 75.1%, 83.1%, and 72.4% mIoU on SemanticKITTI, nuScenes, and Waymo Open, respectively. The model also performs well in knowledge transfer and out-of-distribution adaptation, underscoring its robustness and generalizability. Its effectiveness is validated through comprehensive experiments, ablation studies, and comparisons with existing methods, highlighting its potential for practical deployment in autonomous driving.
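To make the multi-space alignment idea more concrete, below is a minimal PyTorch sketch of two of the three alignments the abstract describes: remapping dataset-specific labels into a shared taxonomy (label-space alignment) and pulling point features toward paired image or text embeddings (feature-space alignment). The lookup tables, the cosine-similarity objective, and the loss weight are illustrative assumptions, not M3Net's exact formulation.

```python
import torch
import torch.nn.functional as F

# Toy per-dataset lookup tables mapping local class ids into a shared
# 4-class unified label space (the ids here are made up for illustration;
# M3Net's actual label-space alignment is more involved).
UNIFIED_LABEL_MAP = {
    "dataset_a": torch.tensor([0, 1, 2, 3]),  # already aligned
    "dataset_b": torch.tensor([2, 0, 3, 1]),  # permuted taxonomy
}

def align_labels(labels: torch.Tensor, dataset: str) -> torch.Tensor:
    """Label-space alignment: remap dataset-specific ids to unified ids."""
    return UNIFIED_LABEL_MAP[dataset][labels]

def multi_space_loss(point_feats, logits, labels, paired_feats, dataset):
    """Segmentation loss in the unified label space plus a cross-modal
    feature-alignment term.

    point_feats:  (N, D) per-point features from the LiDAR branch
    logits:       (N, C) predictions over the unified label space
    labels:       (N,)   dataset-specific ground-truth ids
    paired_feats: (N, D) image or text embeddings paired with each point,
                  e.g. via camera projection (an assumed pairing scheme)
    """
    unified = align_labels(labels, dataset)
    seg_loss = F.cross_entropy(logits, unified)
    # Feature-space alignment: encourage point features to match the
    # paired 2D/text embeddings via cosine similarity.
    align_loss = 1.0 - F.cosine_similarity(point_feats, paired_feats, dim=-1).mean()
    return seg_loss + 0.5 * align_loss  # 0.5 is an illustrative weight

# Toy usage with random tensors standing in for real network outputs.
N, D, C = 8, 16, 4
loss = multi_space_loss(
    point_feats=torch.randn(N, D),
    logits=torch.randn(N, C),
    labels=torch.randint(0, C, (N,)),
    paired_feats=torch.randn(N, D),
    dataset="dataset_b",
)
print(loss.item())
```

In this sketch, the unified taxonomy is what lets a single set of parameters train across datasets whose raw label ids disagree, while the alignment term ties the LiDAR feature space to the image/text modalities; data-space alignment (harmonizing sensor patterns before the network) is omitted for brevity.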