3 January 2024 | Lu Wen, Yongliang Peng, Miao Lin, Nan Gan and Rongqing Tan
This paper proposes DHT-CL, a multi-modal contrastive learning strategy for LiDAR point-cloud rail-obstacle detection in complex weather. The method exploits camera and LiDAR sensor data during training but requires no image input at inference. A Dual-Helix Transformer (DHT) extracts deeper cross-modal information through a neighborhood attention mechanism, and an obstacle anomaly-aware cross-modal discrimination loss drives collaborative optimization of the two branches. The framework extracts 2D and 3D features independently and fuses them with the DHT module; because of the collaborative optimization, the 2D branch can be discarded at inference, so deployment requires only the point cloud and is correspondingly efficient.

On a point-wise annotated complex-weather railway dataset, DHT-CL achieves an mIoU of 87.38%, outperforming high-performance models developed on the autonomous-driving benchmark SemanticKITTI. Qualitative results show improved accuracy in clear weather and fewer false alarms in rain and snow, and the gains are not limited to the camera-LiDAR overlapping field-of-view regions. The strategy also improves the general rain and snow resistance of deep learning-based methods, which filter-based data pre-processing cannot address.

The contributions include a DHT module for robust sensor fusion in complex weather, an adaptive contrastive learning strategy, and a rail-obstacle detection method for multi-class unknown obstacles in complex weather.
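The paper's obstacle anomaly-aware discrimination loss is not reproduced in this summary; as a minimal sketch of the general idea, the following shows a generic InfoNCE-style cross-modal contrastive term, assuming each 3D point feature is paired with a projected 2D image feature (the pairing scheme and the anomaly-aware weighting are assumptions, not the authors' exact formulation):

```python
import numpy as np

def cross_modal_contrastive_loss(feat_3d, feat_2d, temperature=0.1):
    """InfoNCE-style loss: each 3D point feature is pulled toward its
    paired 2D (image) feature and pushed away from all other pairings.
    feat_3d, feat_2d: (N, D) arrays, row i of each forming a positive pair."""
    # L2-normalise so dot products are cosine similarities.
    f3 = feat_3d / np.linalg.norm(feat_3d, axis=1, keepdims=True)
    f2 = feat_2d / np.linalg.norm(feat_2d, axis=1, keepdims=True)
    logits = f3 @ f2.T / temperature              # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives lie on the diagonal (i-th point <-> i-th pixel feature).
    return -np.mean(np.diag(log_prob))
```

Minimizing such a term aligns the 3D branch's feature space with the 2D branch's, which is what lets the 2D branch be dropped at inference while the 3D branch retains the image-derived discriminative structure.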
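The internals of the Dual-Helix Transformer are not detailed in this summary; as a hedged illustration of the neighborhood attention idea it relies on, the sketch below restricts single-head attention to each point's k nearest spatial neighbours (a hypothetical simplification; the actual DHT operates across both modalities with its own dual-branch structure):

```python
import numpy as np

def neighborhood_attention(points, feats, k=4):
    """Single-head attention where each point attends only to its k
    nearest neighbours in 3D space (the point itself is included,
    since its self-distance is zero)."""
    n, d = feats.shape
    # Pairwise Euclidean distances -> indices of the k nearest neighbours.
    dists = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    knn = np.argsort(dists, axis=1)[:, :k]
    out = np.empty_like(feats)
    for i in range(n):
        q = feats[i]                       # query: the point's own feature
        kv = feats[knn[i]]                 # keys/values: neighbour features
        w = np.exp(kv @ q / np.sqrt(d))    # scaled dot-product scores
        w /= w.sum()                       # softmax over the neighbourhood
        out[i] = w @ kv                    # weighted sum of neighbour feats
    return out
```

Restricting attention to local neighbourhoods keeps the cost linear in the number of points rather than quadratic, which matters for dense LiDAR sweeps.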