29 Jul 2016 | Kwang Moo Yi, Eduard Trulls, Vincent Lepetit, Pascal Fua
LIFT: Learned Invariant Feature Transform is a novel deep network architecture that integrates feature detection, orientation estimation, and feature description into a single differentiable pipeline. Unlike previous methods that handle these tasks separately, LIFT learns to perform all three in unison while remaining end-to-end differentiable. The architecture consists of three components, each based on a Convolutional Neural Network (CNN): a Detector that identifies feature points, an Orientation Estimator that assigns each point an orientation, and a Descriptor that computes a robust feature vector for the resulting patch. To keep the whole chain differentiable, the pipeline uses Spatial Transformers to crop and rectify image patches, and replaces the non-differentiable non-maximum suppression of traditional detectors with a softargmax function.
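The softargmax is what lets keypoint locations be extracted from the Detector's score map without breaking gradient flow: instead of a hard argmax, it takes the expected coordinate under a temperature-scaled softmax of the scores. Below is a minimal PyTorch sketch of the idea; the function name, tensor shapes, and the temperature value beta are illustrative assumptions, not the paper's code.

```python
import torch

def softargmax2d(score_map: torch.Tensor, beta: float = 100.0) -> torch.Tensor:
    """Differentiable stand-in for argmax over a 2-D detector score map.

    score_map: (B, H, W) detector scores; returns (B, 2) coordinates (x, y).
    Larger beta sharpens the softmax, approaching a hard argmax.
    """
    b, h, w = score_map.shape
    # Temperature-scaled softmax turns the score map into a probability mass.
    probs = torch.softmax(beta * score_map.reshape(b, -1), dim=-1).reshape(b, h, w)
    ys = torch.arange(h, dtype=score_map.dtype, device=score_map.device)
    xs = torch.arange(w, dtype=score_map.dtype, device=score_map.device)
    # Expected coordinates under that distribution; gradients flow through.
    y = (probs.sum(dim=2) * ys).sum(dim=1)  # marginal over columns, weighted rows
    x = (probs.sum(dim=1) * xs).sum(dim=1)  # marginal over rows, weighted columns
    return torch.stack([x, y], dim=1)
```

The returned coordinates can then drive a Spatial Transformer-style crop, so detection errors backpropagate through the Orientation Estimator and Descriptor losses.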
The system is trained using a Siamese architecture with four branches, each running the full pipeline on one of four patch types: two patches showing different views of the same physical point, one patch containing a different feature point, and one patch containing no distinctive feature point at all. The training data is generated from the Piccadilly and Roman Forum photo-tourism image sets, reconstructed with Structure-from-Motion (SfM), so that reliable feature points and their correspondences across views are known. The network is trained in a problem-specific manner: the Descriptor first, then the Orientation Estimator given the learned Descriptor, and finally the Detector given both. Training each later component against already-learned downstream ones proved more effective than optimizing the whole pipeline from scratch, while still letting the components adapt to one another.
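For the Descriptor stage, the branches yield descriptor pairs that either correspond or do not, and training minimizes a pairwise hinge-style loss over Euclidean distances: matching pairs are pulled together, non-matching pairs pushed beyond a margin. The sketch below is a simplified version of such a loss; the margin value and batch layout are assumptions for illustration.

```python
import torch

def descriptor_pair_loss(desc_a: torch.Tensor,
                         desc_b: torch.Tensor,
                         is_match: torch.Tensor,
                         margin: float = 4.0) -> torch.Tensor:
    """Pairwise hinge loss over descriptor distances.

    desc_a, desc_b: (B, D) descriptors from two Siamese branches.
    is_match:       (B,) floats, 1.0 for corresponding patches, 0.0 otherwise.
    """
    dist = torch.norm(desc_a - desc_b, dim=1)
    # Pull matching pairs together; push non-matching pairs apart until
    # their distance exceeds the margin, after which they contribute zero.
    loss = is_match * dist + (1.0 - is_match) * torch.clamp(margin - dist, min=0.0)
    return loss.mean()
```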
The LIFT pipeline outperforms state-of-the-art methods on benchmark datasets such as Strecha, DTU, and Webcam, achieving superior repeatability, nearest-neighbor mean average precision (NN mAP), and matching score. The results show that integrating the components into a unified pipeline is crucial for optimal performance: a component that looks unremarkable when evaluated in isolation can still be the right choice once the full pipeline is scored together. The system's ability to learn invariant features while remaining end-to-end differentiable makes it a powerful approach to local feature detection and description.
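Of the reported metrics, repeatability is the most self-contained: the fraction of keypoints detected in one image that reappear, under the known scene geometry, near a detection in the other. The NumPy sketch below illustrates the idea for a planar scene with a ground-truth homography; the 3-pixel threshold is an assumption, and the actual benchmark protocol additionally accounts for region overlap and scale.

```python
import numpy as np

def repeatability(kp1: np.ndarray, kp2: np.ndarray,
                  H: np.ndarray, thresh: float = 3.0) -> float:
    """Fraction of image-1 keypoints whose ground-truth projection into
    image 2 lands within `thresh` pixels of some image-2 detection.

    kp1: (N, 2) and kp2: (M, 2) arrays of (x, y); H: 3x3 homography 1 -> 2.
    """
    homog = np.hstack([kp1, np.ones((len(kp1), 1))])  # to homogeneous coords
    proj = homog @ H.T
    proj = proj[:, :2] / proj[:, 2:3]                 # back to pixel coords
    # Pairwise distances between projected and detected keypoints: (N, M).
    d = np.linalg.norm(proj[:, None, :] - kp2[None, :, :], axis=2)
    return float((d.min(axis=1) < thresh).mean())
```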