FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks
**Date:** 6 Dec 2016
**Authors:** Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, Thomas Brox
**Institution:** University of Freiburg, Germany
**Abstract:**
FlowNet demonstrated that optical flow estimation can be cast as a learning problem, but traditional methods still outperform it, especially on small displacements and real-world data. This paper advances the concept of end-to-end learning of optical flow and significantly improves its quality and speed. The improvements are achieved through three major contributions: focusing on the training data schedule, developing a stacked architecture with warping, and introducing a subnetwork for small displacements. FlowNet 2.0 is only marginally slower than the original FlowNet but reduces the estimation error by more than 50%, performing on par with state-of-the-art methods while running at interactive frame rates. Additionally, faster variants are presented, allowing optical flow computation at up to 140 fps with accuracy matching the original FlowNet.
**Introduction:**
The FlowNet by Dosovitskiy et al. introduced a paradigm shift in optical flow estimation by showing that a convolutional neural network (CNN) can learn optical flow directly from data. However, it struggled with small displacements and produced noisy artifacts on real-world footage. FlowNet 2.0 addresses these issues and, in turn, improves performance on downstream tasks such as action recognition and motion segmentation.
**Related Work:**
The paper reviews existing methods for optical flow estimation, including traditional approaches and deep learning-based methods. It highlights the importance of dataset schedules, network stacking, and specialized training for small displacements.
**Dataset Schedules:**
The paper investigates the impact of different training-data schedules on FlowNet's performance. It finds that not only the kind of training data but also the order in which it is presented matters: training first on the simpler FlyingChairs dataset and then fine-tuning on the more realistic FlyingThings3D dataset yields better results than training on either dataset alone or on a mixture of both.
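The schedule described above can be sketched as a plain learning-rate function plus an ordered list of training stages. This is an illustrative sketch: the halving rule and the concrete iteration counts below are assumptions for demonstration, not the paper's exact numbers.

```python
def lr_at(step, base_lr=1e-4, decay_start=400_000, decay_every=200_000):
    """Piecewise-constant schedule: hold base_lr, then halve periodically.

    decay_start / decay_every are illustrative values, not the paper's
    exact schedule.
    """
    if step < decay_start:
        return base_lr
    halvings = (step - decay_start) // decay_every + 1
    return base_lr * (0.5 ** halvings)

# Curriculum: simpler synthetic data first, then fine-tune on harder data
# at a reduced learning rate. Stage lengths here are assumptions.
stages = [
    ("FlyingChairs", 1_200_000, 1e-4),   # long initial training
    ("FlyingThings3D", 500_000, 1e-5),   # fine-tuning stage
]
```

A training loop would iterate over `stages` in order, resetting `lr_at` with each stage's base learning rate; the key point is that the curriculum order, not just the data mix, drives the final accuracy.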
**Stacking Networks:**
The paper introduces a stacked architecture in which multiple networks iteratively refine the flow estimate: each subsequent network receives the input images, the previous flow estimate, and the second image warped toward the first with that estimate, so it only has to learn the remaining residual. Warping between networks significantly improves accuracy, and training the stack stage by stage, with earlier networks kept fixed, helps avoid overfitting.
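The warping step between stacked networks can be illustrated with a minimal NumPy sketch: each output pixel samples the second image at the position shifted by the current flow estimate (backward warping with bilinear interpolation and border clamping). This is a simplified stand-in for the differentiable warping layer in the actual networks:

```python
import numpy as np

def warp(img, flow):
    """Backward-warp a grayscale image: out[y, x] = img[y + v, x + u].

    img:  (H, W) float array.
    flow: (H, W, 2) array, channel 0 = u (horizontal), channel 1 = v (vertical).
    Samples with bilinear interpolation; coordinates are clamped at the border.
    """
    H, W = img.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(np.float64)
    xq = np.clip(xs + flow[..., 0], 0, W - 1)   # query x-coordinates
    yq = np.clip(ys + flow[..., 1], 0, H - 1)   # query y-coordinates
    x0 = np.floor(xq).astype(int); y0 = np.floor(yq).astype(int)
    x1 = np.minimum(x0 + 1, W - 1); y1 = np.minimum(y0 + 1, H - 1)
    wx = xq - x0; wy = yq - y0                  # bilinear weights
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bot = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bot * wy
```

A later network in the stack would receive `warp(img2, flow_estimate)` alongside the original images: if the estimate were perfect, the warped second image would match the first, so any remaining difference points the network at its residual error.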
**Small Displacements:**
To address small displacements, the paper introduces a specialized training dataset with smaller, more realistic motion magnitudes, together with a network architecture tailored to it. This specialized network performs well on the subtle motions typical of real-world videos, and a small fusion network combines its estimate with that of the large-displacement stack into the final flow field.
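In the paper the fusion is performed by a small learned network. As a purely illustrative, hand-crafted stand-in, one can blend the two estimates per pixel by motion magnitude, preferring the small-displacement network where the predicted motion is small; the threshold here is an arbitrary assumption:

```python
import numpy as np

def fuse(flow_large, flow_small, thresh=2.0):
    """Toy per-pixel fusion of two flow estimates (not the paper's method).

    Blends by the magnitude of the large-displacement estimate:
    weight 0 -> take the small-displacement estimate,
    weight 1 -> take the large-displacement estimate.
    Both inputs are (H, W, 2) arrays; thresh (in pixels) is an assumption.
    """
    mag = np.linalg.norm(flow_large, axis=-1, keepdims=True)
    w = np.clip(mag / thresh, 0.0, 1.0)
    return w * flow_large + (1 - w) * flow_small
```

The learned fusion network in the paper also looks at brightness error and flow magnitude, but it serves the same role: route each pixel to whichever estimator is more trustworthy for that motion range.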
**Experiments:**
The paper evaluates FlowNet 2.0 on public benchmarks and real-world data, showing that it matches the accuracy of state-of-the-art methods while running much faster. It also demonstrates the reliability of the estimated flow in downstream motion segmentation and action recognition tasks.
**Conclusions:**
FlowNet 2.0 represents a significant advancement in optical flow estimation, achieving accuracy on par with the state of the art while being orders of magnitude faster. It provides a robust and efficient solution for the wide range of applications that require accurate and fast optical flow computation.