FlowNet: Learning Optical Flow with Convolutional Networks

4 May 2015 | Philipp Fischer, Alexey Dosovitskiy, Eddy Ilg, Philip Häusser, Caner Hazırbaş, Vladimir Golkov, Patrick van der Smagt, Daniel Cremers, Thomas Brox
FlowNet is a convolutional neural network (CNN) architecture that estimates optical flow from a pair of images. The paper introduces two variants. FlowNetSimple processes both images through a generic network. FlowNetCorr adds a correlation layer that explicitly matches features between the two images by comparing feature patches from one image with feature patches from the other.

Ground-truth optical flow is scarce, so the authors generate a synthetic dataset called Flying Chairs, which overlays segmented chair images on random background images from Flickr. This makes it possible to produce a large number of synthetic image pairs with varying displacements, enough to train CNNs effectively. Networks trained on this dataset generalize well to existing benchmark datasets such as Sintel and KITTI, achieving competitive accuracy at frame rates of 5 to 10 fps, with up to 10 image pairs per second at the full resolution of the Sintel dataset.

The paper also discusses variational refinement as a way to improve the predicted flow, and the importance of data augmentation in preventing overfitting. The reported experiments indicate that FlowNetCorr outperforms FlowNetSimple on several of the evaluated datasets, and that the networks, which are trained end-to-end, achieve state-of-the-art accuracy among real-time methods. The results demonstrate that CNNs are effective for optical flow estimation and highlight the potential of synthetic data for training models that transfer to real-world applications.
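As a rough illustration of what a correlation layer computes, the sketch below compares the feature vector at each location of one feature map with the feature vectors in a small neighbourhood of a second feature map, producing one output channel per candidate displacement. It is not the authors' implementation: the function name `correlation_layer`, the parameter `max_disp`, the use of single-pixel patches, and the absence of strides are all simplifying assumptions made for this example.

```python
import numpy as np


def correlation_layer(f1, f2, max_disp=2):
    """Correlate each location of f1 with a neighbourhood of f2.

    f1, f2: feature maps of shape (C, H, W).
    Returns an array of shape ((2 * max_disp + 1) ** 2, H, W); channel k
    holds the dot product between f1 at (y, x) and f2 at (y + dy, x + dx),
    with displacements enumerated row by row from (-max_disp, -max_disp).
    """
    c, h, w = f1.shape
    d = max_disp
    # Zero-pad f2 spatially so that every shifted window stays in bounds.
    f2_pad = np.pad(f2, ((0, 0), (d, d), (d, d)))
    out = np.zeros(((2 * d + 1) ** 2, h, w), dtype=np.float64)
    k = 0
    for dy in range(-d, d + 1):
        for dx in range(-d, d + 1):
            # shifted[:, y, x] equals f2[:, y + dy, x + dx] (zero outside f2).
            shifted = f2_pad[:, d + dy:d + dy + h, d + dx:d + dx + w]
            # Dot product over the channel axis at every location.
            out[k] = (f1 * shifted).sum(axis=0)
            k += 1
    return out


# Hypothetical usage with random feature maps (illustrative values only).
rng = np.random.default_rng(0)
feat_a = rng.standard_normal((8, 6, 6))
feat_b = rng.standard_normal((8, 6, 6))
corr = correlation_layer(feat_a, feat_b, max_disp=2)
print(corr.shape)  # (25, 6, 6)
```

In this sketch a high value in a given output channel suggests that the corresponding displacement is a good match at that location; later layers of a network would then turn such matching scores into a flow estimate.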