23 May 2024 | Yihan Wang, Lahav Lipson, and Jia Deng
SEA-RAFT is a novel and efficient variant of the RAFT optical flow algorithm, designed to achieve both simplicity and accuracy. It introduces several key improvements over the original RAFT:
1. **Mixture of Laplace Loss**: Instead of the standard $L_1$ loss, SEA-RAFT trains with the negative log-likelihood of a mixture of Laplace distributions whose parameters are predicted by the network. Modeling the flow this way reduces overfitting to ambiguous pixels and improves generalization.
2. **Direct Regression of Initial Flow**: The initial flow is regressed directly from the context encoder rather than initialized to zero. This significantly reduces the number of refinement iterations needed to converge, improving efficiency.
3. **Rigid-Flow Pre-Training**: Pre-training on the TartanAir dataset, where optical flow induced by rigid camera motion can be derived from the provided depth and pose annotations, enhances the model's generalization.
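To make the first point concrete, the mixture loss can be written as the negative log-likelihood of a two-component Laplace mixture over the per-pixel flow error. The sketch below is a minimal NumPy illustration under assumed names and a simplified parameterization (per-pixel mixture logit `logit_w` and log-scales `log_b1`, `log_b2`); the paper's exact formulation of which parameters are fixed versus predicted differs.

```python
import numpy as np

def laplace_log_pdf(err, log_b):
    # Log density of a zero-mean Laplace with scale b = exp(log_b):
    # log p(err) = -|err| / b - log(2b)
    return -np.abs(err) * np.exp(-log_b) - log_b - np.log(2.0)

def mixture_laplace_nll(flow_pred, flow_gt, logit_w, log_b1, log_b2):
    """NLL of a two-component Laplace mixture on the flow error.

    flow_pred, flow_gt: arrays of shape (..., 2)  (u, v flow channels)
    logit_w:            mixture-weight logit per pixel
    log_b1, log_b2:     log scales of the two components
    (Illustrative parameterization, not SEA-RAFT's exact one.)
    """
    err = flow_pred - flow_gt                    # per-pixel flow error
    log_w1 = -np.logaddexp(0.0, -logit_w)        # log sigmoid(logit_w)
    log_w2 = -np.logaddexp(0.0, logit_w)         # log(1 - sigmoid(logit_w))
    # Sum the log densities over the two flow channels for each component.
    lp1 = laplace_log_pdf(err, log_b1).sum(axis=-1)
    lp2 = laplace_log_pdf(err, log_b2).sum(axis=-1)
    # Numerically stable log of the weighted mixture density.
    log_mix = np.logaddexp(log_w1 + lp1, log_w2 + lp2)
    return -log_mix.mean()
```

With the mixture weight saturated toward one component and unit scale, the loss reduces to the familiar $L_1$ error plus a constant, which is why this can be seen as a probabilistic generalization of the $L_1$ objective.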
These improvements result in state-of-the-art accuracy on the Spring benchmark: an endpoint error (EPE) of 3.69 and a 1-pixel outlier rate (1px) of 0.36, a 22.9% and 17.8% reduction from the best published results, respectively. SEA-RAFT also outperforms existing methods on the KITTI and Sintel datasets while remaining highly efficient, running at least 2.3 times faster than methods with comparable performance. The code is publicly available at <https://github.com/princeton-vl/SEA-RAFT>.