S³M-Net: Joint Learning of Semantic Segmentation and Stereo Matching for Autonomous Driving

29 Jan 2024 | Zhiyuan Wu, Yi Feng, Chuang-Wei Liu, Fisher Yu, Qijun Chen, Rui Fan
S³M-Net is a joint learning framework that performs semantic segmentation and stereo matching simultaneously. Both tasks are key components of 3D environmental perception systems for autonomous driving. The framework shares the features extracted from RGB images between the two tasks, which improves overall scene understanding. A feature fusion adaptation (FFA) module transforms the shared features into semantic space and fuses them with encoded disparity features. The whole framework is trained with a semantic consistency-guided (SCG) loss function, which emphasizes structural consistency in both tasks. Extensive experiments on the vKITTI2 and KITTI datasets show that S³M-Net outperforms state-of-the-art single-task networks in both semantic segmentation and stereo matching across several evaluation metrics. The main contributions are the joint learning framework, the FFA module, and the SCG loss function.
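The idea of sharing one set of image features between the two tasks can be illustrated with a deliberately tiny sketch. It is not the S³M-Net architecture: the summary above does not specify the encoder, the FFA module, or the SCG loss, so every function here (`encode`, `segment`, `match`, `joint_forward`) is a hypothetical stand-in operating on a single 1D scanline. It only shows features being computed once per image and then consumed by both a segmentation head and a stereo matching head.

```python
import math

def encode(row):
    """Hypothetical shared encoder: per-pixel feature = (intensity, central difference)."""
    n = len(row)
    return [(row[i], row[min(i + 1, n - 1)] - row[max(i - 1, 0)]) for i in range(n)]

def segment(features, threshold=0.5):
    """Toy segmentation head: label a pixel 1 if its intensity exceeds the threshold."""
    return [1 if f[0] > threshold else 0 for f in features]

def match(left_feats, right_feats, max_disp=3):
    """Toy stereo head: for each left pixel x, choose the disparity d whose
    right-image feature at x - d is closest in squared distance."""
    disparities = []
    for x, fl in enumerate(left_feats):
        best_d, best_cost = 0, math.inf
        for d in range(min(max_disp, x) + 1):
            fr = right_feats[x - d]
            cost = sum((a - b) ** 2 for a, b in zip(fl, fr))
            if cost < best_cost:  # ties keep the smaller disparity
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities

def joint_forward(left_row, right_row):
    """Encode each image once, then feed the same features to both heads."""
    left_feats = encode(left_row)
    right_feats = encode(right_row)
    labels = segment(left_feats)                 # task 1: semantic segmentation
    disparity = match(left_feats, right_feats)   # task 2: stereo matching
    return labels, disparity

# A bright two-pixel object that appears shifted by 2 pixels in the right view.
left = [0, 0, 0, 0, 1, 1, 0, 0]
right = [0, 0, 1, 1, 0, 0, 0, 0]
labels, disparity = joint_forward(left, right)
print(labels)
print(disparity)
```

In this toy input, the object pixels (positions 4 and 5 in the left scanline) are labeled 1 and receive a disparity of 2. Disparities on the uniform background are ambiguous and should not be read as meaningful. In the actual framework the shared features come from a learned network, and the two heads are trained jointly rather than hand-written.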