MoCha-Stereo: Motif Channel Attention Network for Stereo Matching

11 Apr 2024 | Ziyang Chen, Wei Long, He Yao, Yongjun Zhang, Bingshu Wang, Yongbin Qin, Jia Wu
MoCha-Stereo is a stereo matching framework that targets the loss of geometric structure in feature channels, a loss that causes mismatches in edge details. It introduces Motif Channel Attention (MCA), which restores lost detail by exploiting geometric contours that recur across normal feature channels. It also proposes the Motif Channel Correlation Volume (MCCV) to improve edge matching accuracy and the Reconstruction Error Motif Penalty (REMP) to refine the full-resolution disparity estimate.

MoCha-Stereo reports state-of-the-art results on KITTI 2015, KITTI 2012, Scene Flow, ETH3D, and Middlebury. It ranks first on the KITTI 2015 and KITTI 2012 Reflective leaderboards, performs strongly on multi-view stereo (MVS) tasks, and shows strong zero-shot generalization on Middlebury and ETH3D. It is also more efficient, reaching high accuracy with fewer disparity-estimation iterations. The framework is implemented in PyTorch and validated through extensive experiments and comparisons with existing methods, in which it outperforms other state-of-the-art stereo matching algorithms in accuracy, efficiency, and generalization.

The core contributions are the use of motif channels to restore geometric structure, MCCV for accurate cost computation, and REMP for disparity refinement. Together, these enable high-quality stereo matching across a range of scenarios and datasets.
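
For readers unfamiliar with the term, a correlation volume stores, for every pixel and every candidate disparity, how well a left-image feature matches the right-image feature shifted by that disparity. The sketch below illustrates only this generic idea on a single scanline in plain Python. It is not MCCV and not the authors' PyTorch implementation; the function names (`correlation_volume`, `winner_takes_all`, `make_shifted_pair`) and the one-hot toy features are assumptions made for illustration.

```python
from typing import List, Tuple

# features[c][x]: value of channel c at column x of one image scanline
Features = List[List[float]]


def correlation_volume(left: Features, right: Features, max_disp: int) -> List[List[float]]:
    """Return volume[d][x]: channel-averaged dot product between the left
    feature at column x and the right feature at column x - d."""
    channels = len(left)
    width = len(left[0])
    volume = []
    for d in range(max_disp):
        row = []
        for x in range(width):
            if x - d < 0:
                # The candidate match falls outside the right image.
                row.append(0.0)
            else:
                dot = sum(left[c][x] * right[c][x - d] for c in range(channels))
                row.append(dot / channels)
        volume.append(row)
    return volume


def winner_takes_all(volume: List[List[float]]) -> List[int]:
    """For each column, pick the disparity with the highest correlation
    (ties resolve to the smallest disparity)."""
    width = len(volume[0])
    return [max(range(len(volume)), key=lambda d: volume[d][x]) for x in range(width)]


def make_shifted_pair(width: int, shift: int) -> Tuple[Features, Features]:
    """Hypothetical toy data: one-hot right features, and left features that
    equal the right features shifted by `shift` columns."""
    right = [[1.0 if c == x else 0.0 for x in range(width)] for c in range(width)]
    left = [[right[c][x - shift] if x >= shift else 0.0 for x in range(width)]
            for c in range(width)]
    return left, right


left, right = make_shifted_pair(width=6, shift=2)
volume = correlation_volume(left, right, max_disp=4)
disparity = winner_takes_all(volume)
print(disparity)
```

On this toy pair, every column that has a valid match recovers the shift of 2, while the first two columns have no counterpart in the right scanline and fall back to disparity 0. Iterative methods typically refine a disparity estimate by looking up such a volume repeatedly rather than taking a single arg-max as done here.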