The paper introduces MoCha-Stereo, a novel stereo matching framework designed to address the issue of geometric structure loss in feature channels, which often leads to edge detail mismatches. The core idea of MoCha-Stereo is to restore lost detailed features in feature channels by utilizing repeated geometric contours, referred to as motif channels. The key contributions include:
1. **Motif Channel Attention (MCA)**: This technique captures repeated geometric patterns in feature channels using sliding windows, enhancing the attention to geometric structures.
2. **Channel Affinity Matrix Profile (CAMP)-guided Correlation Volume**: This method constructs a correlation volume that leverages the affinity between motif and normal channels, improving the accuracy of cost computation for edge matching.
3. **Reconstruction Error Motif Penalty (REMP)**: This module refines the full-resolution disparity map by incorporating high- and low-frequency information from the reconstruction error map, further enhancing detail matching.
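
To make the idea of finding repeated patterns with sliding windows more concrete, here is a minimal sketch of a brute-force, matrix-profile-style motif search on a plain 1D sequence. It is not the paper's MCA module, which operates on learned feature channels. The function names (`window_distance`, `matrix_profile`, `find_motif`), the toy signal, and the use of unnormalized Euclidean distance are all assumptions made for illustration.

```python
import math


def window_distance(a, b):
    """Euclidean distance between two equal-length windows."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def matrix_profile(signal, m):
    """For each length-m window, return the distance to its nearest
    non-overlapping window (a brute-force matrix profile)."""
    n = len(signal) - m + 1
    windows = [signal[i:i + m] for i in range(n)]
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            # Skip overlapping windows, which would match trivially.
            if abs(i - j) >= m:
                best = min(best, window_distance(windows[i], windows[j]))
        profile.append(best)
    return profile


def find_motif(signal, m):
    """Return the start index and profile value of the most repeated
    window, i.e. the window with the smallest profile value."""
    profile = matrix_profile(signal, m)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]


# Hypothetical toy signal: the pattern [2, 9, 4] occurs at indices 1 and 8.
signal = [0, 2, 9, 4, 7, 1, 5, 8, 2, 9, 4, 6, 3]
start, dist = find_motif(signal, m=3)
print(start, dist, signal[start:start + 3])
```

In this toy case the window starting at index 1 has an exact repeat at index 8, so its profile value is zero and it is selected as the motif. The brute-force search is quadratic in the number of windows, which is acceptable for a sketch but not for large inputs.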
MoCha-Stereo has been evaluated on several datasets, including KITTI-2015, KITTI-2012, Scene Flow, ETH3D, and Middlebury, as well as in the multi-view stereo (MVS) domain. It achieves state-of-the-art performance on these benchmarks, demonstrating robust cross-dataset generalization and efficient iteration. The paper also includes ablation studies to validate the effectiveness of each component of the model.