Optical Flow Estimation Using a Spatial Pyramid Network

The paper "Optical Flow Estimation using a Spatial Pyramid Network" by Anurag Ranjan and Michael J. Black introduces a method for optical flow estimation that combines the classical spatial pyramid formulation with deep learning. The method, called Spatial Pyramid Network (SPyNet), estimates large motions coarse-to-fine: at each pyramid level it warps one image of the pair by the current flow estimate and computes an update to the flow. Where classical methods minimize an objective function at each level, SPyNet instead trains one deep network per pyramid level to compute the flow update.
This approach has several advantages. SPyNet is simpler than FlowNet and has 96% fewer model parameters, which makes it more efficient and better suited to embedded applications. The learned convolutional filters resemble classical spatio-temporal filters, which gives some insight into why the method works. SPyNet achieves comparable or better accuracy than FlowNet on standard benchmarks such as Sintel, KITTI, and Middlebury, while being significantly faster and more memory-efficient. The paper also discusses the method's limitations, particularly in handling large motions of small or thin objects, and suggests directions for future work, including the use of more frames and larger training datasets.
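
The coarse-to-fine loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the images are grayscale with sides divisible by 2^(levels-1), the pyramid uses 2x2 average pooling, and `constant_residual` is a hypothetical stand-in for a trained per-level network. The predictor interface (first image, warped second image, upsampled flow) is likewise an assumption of this sketch.

```python
import numpy as np

def downsample(img):
    """Halve resolution with 2x2 average pooling (assumes even height and width)."""
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2])

def upsample_flow(flow):
    """Double the resolution (nearest neighbour) and double the vectors,
    since one coarse pixel spans two pixels on the finer grid."""
    return 2.0 * flow.repeat(2, axis=0).repeat(2, axis=1)

def warp(img, flow):
    """Backward-warp img by flow (x in channel 0, y in channel 1), bilinear sampling."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    top = (1 - wx) * img[y0, x0] + wx * img[y0, x1]
    bottom = (1 - wx) * img[y1, x0] + wx * img[y1, x1]
    return (1 - wy) * top + wy * bottom

def coarse_to_fine_flow(img1, img2, predictors):
    """Run one residual predictor per pyramid level; predictors[0] is the coarsest."""
    pyramid = [(img1, img2)]
    for _ in range(len(predictors) - 1):
        a, b = pyramid[-1]
        pyramid.append((downsample(a), downsample(b)))
    pyramid.reverse()  # coarsest level first

    h, w = pyramid[0][0].shape
    flow = np.zeros((h, w, 2))  # start from zero flow at the coarsest level
    for k, ((a, b), predict) in enumerate(zip(pyramid, predictors)):
        if k > 0:
            flow = upsample_flow(flow)
        warped = warp(b, flow)                  # warp the second image by the current flow
        flow = flow + predict(a, warped, flow)  # add this level's flow update
    return flow

def constant_residual(img1, warped_img2, flow):
    """Hypothetical stand-in for a trained network: always predicts +1 px in x."""
    residual = np.zeros_like(flow)
    residual[..., 0] = 1.0
    return residual

rng = np.random.default_rng(0)
img1, img2 = rng.random((32, 32)), rng.random((32, 32))
flow = coarse_to_fine_flow(img1, img2, [constant_residual] * 3)
print(flow.shape, flow[0, 0])  # (32, 32, 2) [7. 0.]
```

With three levels, a one-pixel update at each level accumulates to 7 pixels at full resolution (1, then 2·1 + 1 = 3, then 2·3 + 1 = 7). This shows how small per-level updates can add up to a large motion, and it is why each level's predictor only has to handle small displacements.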

Optical Flow Estimation using a Spatial Pyramid Network

21 Nov 2016 | Anurag Ranjan, Michael J. Black