28 Mar 2020 | Paul-Edouard Sarlin, Daniel DeTone, Tomasz Malisiewicz, Andrew Rabinovich
SuperGlue is a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. It estimates assignments by solving a differentiable optimal transport problem whose costs are predicted by a graph neural network. An attention-based context aggregation mechanism lets SuperGlue reason about the underlying 3D scene and the feature assignments jointly. It outperforms traditional heuristics and other learned approaches in pose estimation for challenging indoor and outdoor environments. SuperGlue performs real-time matching on a modern GPU and can be integrated into SfM or SLAM systems. The code and trained weights are publicly available.
SuperGlue is designed to solve the problem of feature matching by finding a partial assignment between two sets of local features. It uses a graph neural network with attention to solve an assignment optimization problem, handling partial point visibility and occlusion. The network is trained end-to-end from image pairs, learning priors for pose estimation from a large annotated dataset. SuperGlue can be applied to various multiple-view geometry problems requiring high-quality feature correspondences.
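To illustrate what a partial assignment means in practice, the following minimal sketch reads matches off a soft assignment matrix. The function name `extract_matches`, the mutual-argmax check, and the threshold value are assumptions made for illustration; they are not details stated in this summary.

```python
def extract_matches(P, threshold=0.2):
    """Return (i, j) pairs from a soft partial assignment matrix P (M x N).

    A pair is kept only if i and j pick each other (mutual argmax) and the
    score exceeds `threshold` (an illustrative value). Every other keypoint
    is left unmatched, which is what makes the assignment partial.
    """
    if not P or not P[0]:
        return []
    n_rows, n_cols = len(P), len(P[0])
    # Best column for each row, and best row for each column.
    best_col = [max(range(n_cols), key=lambda j: row[j]) for row in P]
    best_row = [max(range(n_rows), key=lambda i: P[i][j]) for j in range(n_cols)]
    return [
        (i, j)
        for i, j in enumerate(best_col)
        if best_row[j] == i and P[i][j] > threshold
    ]


# Toy example: 3 keypoints in image A, 2 keypoints in image B.
P = [
    [0.90, 0.05],
    [0.10, 0.70],
    [0.05, 0.10],  # no confident partner, so this keypoint stays unmatched
]
matches = extract_matches(P)
print(matches)
```

Each keypoint appears in at most one returned pair, so occluded or out-of-view points simply drop out instead of being forced onto a poor partner.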
The SuperGlue architecture consists of an attentional graph neural network and an optimal matching layer. The graph neural network uses self- and cross-attention to aggregate context within each image and across the two images, while the optimal matching layer builds a score matrix and finds the optimal partial assignment using the Sinkhorn algorithm. The network is trained to predict the assignment matrix with a loss that minimizes the negative log-likelihood of the ground-truth assignment.
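As a rough sketch of the optimal matching layer, the code below runs log-domain Sinkhorn normalization on a score matrix augmented with an extra "dustbin" row and column that absorb unmatched keypoints, then evaluates a negative log-likelihood on a labeled assignment. The dustbin augmentation, the fixed `dustbin_score`, the iteration count, and all function names are assumptions for illustration. In a trained network the scores would come from the graph neural network rather than being typed in by hand.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def sinkhorn_partial_assignment(scores, dustbin_score=0.0, iters=100):
    """Soft partial assignment for an M x N score matrix (list of lists).

    Returns an (M+1) x (N+1) matrix whose last row and column are dustbins.
    """
    m, n = len(scores), len(scores[0])
    # Append a dustbin column to every row, then a dustbin row.
    S = [list(row) + [dustbin_score] for row in scores]
    S.append([dustbin_score] * (n + 1))
    # Log target marginals: each keypoint carries mass 1, and a dustbin
    # may absorb every keypoint of the other image.
    log_a = [0.0] * m + [math.log(n)]
    log_b = [0.0] * n + [math.log(m)]
    u = [0.0] * (m + 1)
    v = [0.0] * (n + 1)
    for _ in range(iters):
        # Alternately rescale rows, then columns, in log space.
        u = [log_a[i] - logsumexp([S[i][j] + v[j] for j in range(n + 1)])
             for i in range(m + 1)]
        v = [log_b[j] - logsumexp([S[i][j] + u[i] for i in range(m + 1)])
             for j in range(n + 1)]
    return [[math.exp(S[i][j] + u[i] + v[j]) for j in range(n + 1)]
            for i in range(m + 1)]


def assignment_nll(P, matches, unmatched_a, unmatched_b):
    """Negative log-likelihood of a labeled partial assignment under P."""
    dust_row, dust_col = len(P) - 1, len(P[0]) - 1
    loss = -sum(math.log(P[i][j]) for i, j in matches)
    loss -= sum(math.log(P[i][dust_col]) for i in unmatched_a)
    loss -= sum(math.log(P[dust_row][j]) for j in unmatched_b)
    return loss


# Toy scores: keypoints 0 and 1 of image A agree with keypoints 0 and 1 of
# image B, while keypoint 2 of image A has no good partner.
scores = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
P = sinkhorn_partial_assignment(scores)
loss = assignment_nll(P, matches=[(0, 0), (1, 1)], unmatched_a=[2], unmatched_b=[])
print(round(loss, 3))
```

Because every step is a differentiable operation on the scores, the same computation written in an autodiff framework lets the loss gradient flow back to whatever network produced the scores.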
SuperGlue is equivariant to permutations of keypoints within an image and between images, making it suitable for symmetric problems. It outperforms other methods in homography estimation, indoor and outdoor pose estimation, and visual localization. SuperGlue achieves state-of-the-art results in pose estimation and is efficient, running in real-time on a GPU. It can be combined with any local feature detector and descriptor, working particularly well with SuperPoint. SuperGlue is trained on ground truth matches and can be used with both SIFT and SuperPoint features. It outperforms other methods in terms of precision, recall, and matching score, and is effective in challenging scenarios with repeated texture, large viewpoint changes, and illumination changes. SuperGlue is a major milestone towards end-to-end deep SLAM.
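The permutation property can be checked on a stripped-down attention step. The sketch below is a single-head softmax attention that uses raw descriptors directly as queries, keys, and values. This simplification is an assumption made for illustration: it has no learned projections, no multiple heads, and no keypoint position encoding, so it is not the network itself. The function name `attend` and the toy descriptors are hypothetical.

```python
import math


def attend(queries, keys, values):
    """Single-head softmax attention: one aggregated message per query."""
    dim = len(values[0])
    messages = []
    for q in queries:
        # Dot-product similarity between this query and every key.
        logits = [sum(a * b for a, b in zip(q, k)) for k in keys]
        top = max(logits)
        weights = [math.exp(l - top) for l in logits]
        total = sum(weights)
        # Weighted average of the values.
        messages.append([
            sum(w * val[d] for w, val in zip(weights, values)) / total
            for d in range(dim)
        ])
    return messages


# Toy descriptors for image A (3 keypoints) and image B (2 keypoints).
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
B = [[0.5, 0.5], [1.0, -1.0]]

# Cross-attention: keypoints of A gather messages from keypoints of B.
msgs = attend(A, B, B)

# Reordering the keypoints of A reorders the messages in the same way.
perm = [2, 0, 1]
msgs_perm = attend([A[i] for i in perm], B, B)
print(msgs_perm == [msgs[i] for i in perm])

# Reordering the keypoints of B leaves the messages for A unchanged,
# up to floating-point rounding.
msgs_b_rev = attend(A, B[::-1], B[::-1])
print(all(abs(x - y) < 1e-12
          for row, ref in zip(msgs_b_rev, msgs)
          for x, y in zip(row, ref)))
```

Self-attention is the same call with `attend(A, A, A)`. The check covers only reordering within an image; swapping the roles of the two images would need the full matching pipeline.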