Dense Visual SLAM for RGB-D Cameras — Christian Kerl, Jürgen Sturm, and Daniel Cremers
This paper presents a dense visual SLAM method for RGB-D cameras that minimizes both photometric and depth errors across all pixels. Unlike sparse feature-based methods, this approach better utilizes image data to achieve higher pose accuracy. An entropy-based similarity measure is introduced for keyframe selection and loop closure detection. A graph is built from successful matches and optimized using the g2o framework. The method is evaluated on benchmark datasets, showing strong performance in low-texture and low-structure scenes. It outperforms state-of-the-art methods in trajectory error. The software is released as open-source.
The method combines dense visual odometry with pose graph optimization. It uses a fast frame-to-frame registration method that optimizes both intensity and depth errors. An entropy-based method selects keyframes to reduce drift. Loop closures are validated using the same entropy metric. All techniques are integrated into a graph SLAM solver to further reduce drift.
The method estimates camera motion from RGB-D image streams. At each time step, the camera provides an RGB-D image with intensity and depth data. The goal is to calculate the rigid body motion between consecutive images. The method minimizes photometric and geometric errors to estimate motion.
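The joint minimization described above can be sketched as follows, using assumed but standard notation (ξ for the twist coordinates of the rigid body motion, τ for the warping function, I and Z for the intensity and depth images); this is a paraphrase of the paper's formulation, not a verbatim reproduction:

```latex
% Photometric and geometric residuals at pixel x_i for motion parameters \xi:
%   r_I(\xi, x) = I_2(\tau(x, \xi)) - I_1(x)
%   r_Z(\xi, x) = Z_2(\tau(x, \xi)) - [T(\xi)\,\pi^{-1}(x, Z_1(x))]_Z
% The motion estimate minimizes the weighted sum of both residuals over all pixels:
\xi^{*} = \arg\min_{\xi} \sum_{i} w_i \, \mathbf{r}_i^{\top} \Sigma^{-1} \mathbf{r}_i,
\qquad
\mathbf{r}_i = \begin{pmatrix} r_I(\xi, x_i) \\ r_Z(\xi, x_i) \end{pmatrix}
```

Here the weights w_i and the scale matrix Σ come from the probabilistic (t-distribution) error model described below.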
The camera model uses homogeneous coordinates to define 3D points. A rigid body motion is represented by a transformation matrix. The method derives a warping function to compute pixel locations in the second image given a rigid body motion.
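The warping function can be sketched as follows: back-project a pixel using its depth, transform the resulting 3D point by the rigid body motion, and re-project it with the pinhole model. The helper name and the exact intrinsics handling are assumptions for illustration, not the paper's code:

```python
import numpy as np

def warp_pixel(x, y, depth, K, T):
    """Warp a pixel from the first image into the second image, given a
    rigid body motion T (4x4 homogeneous transform) and pinhole
    intrinsics K (3x3). Illustrative sketch of the warping function.
    """
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    # Back-project the pixel to a 3D point in the first camera frame
    # (homogeneous coordinates, as in the paper's camera model).
    p = np.array([(x - cx) * depth / fx,
                  (y - cy) * depth / fy,
                  depth,
                  1.0])
    # Apply the rigid body motion.
    q = T @ p
    # Re-project with the pinhole model to get the pixel in image two.
    x2 = fx * q[0] / q[2] + cx
    y2 = fy * q[1] / q[2] + cy
    return x2, y2, q[2]  # warped pixel location and predicted depth
```

With the identity motion, a pixel warps onto itself and the predicted depth equals the input depth, which is a useful sanity check.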
Photometric and depth errors are defined based on the warping function. The photometric error is the difference in intensity between two images. The depth error is the difference between predicted and actual depth measurements. A probabilistic formulation is used, with the photometric error modeled as a t-distribution.
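The t-distribution model leads to a simple robust weighting scheme: large residuals (e.g. from moving objects or occlusion boundaries) receive low weights. A minimal sketch, not the authors' exact implementation, using a scalar residual and a fixed-point iteration for the scale:

```python
import numpy as np

def t_weights(residuals, dof=5.0, iters=10):
    """Per-pixel weights under a Student-t error model with `dof`
    degrees of freedom. The scale sigma^2 is estimated jointly with the
    weights by fixed-point iteration (an EM-style update); each residual
    r then gets weight (dof + 1) / (dof + r^2 / sigma^2), which
    downweights outliers relative to a plain least-squares fit.
    """
    r2 = np.asarray(residuals, dtype=float) ** 2
    sigma2 = np.mean(r2)  # initial scale estimate
    for _ in range(iters):
        w = (dof + 1.0) / (dof + r2 / sigma2)
        sigma2 = np.mean(w * r2)  # reweighted scale update
    return (dof + 1.0) / (dof + r2 / sigma2)
```

The paper uses a bivariate t-distribution over the stacked photometric and depth residual with a full scale matrix; the scalar version above shows the same downweighting mechanism.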
The method linearizes the error function and solves a non-linear least squares problem. The error function is minimized using a first-order Taylor expansion. The normal equations are solved iteratively, with the scale matrix and weights updated using an expectation maximization algorithm.
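One iteration of this scheme reduces to solving weighted normal equations. A sketch under the assumption of an n×6 stacked Jacobian J of the residuals with respect to the twist coordinates, a residual vector r, and diagonal weights w from the t-distribution model:

```python
import numpy as np

def weighted_gauss_newton_step(J, r, w):
    """One iteratively reweighted Gauss-Newton step: solve
        (J^T W J) dx = -J^T W r
    for the twist increment dx, where W = diag(w). In the full method
    the weights and scale matrix are re-estimated between iterations
    (the EM-style update), and dx updates the motion estimate.
    """
    JW = J * w[:, None]          # equivalent to W @ J for diagonal W
    H = J.T @ JW                 # 6x6 normal matrix (Gauss-Newton Hessian)
    g = JW.T @ r                 # gradient term J^T W r
    return np.linalg.solve(H, -g)
```

For a purely linear residual the step recovers the weighted least-squares solution in one iteration; in the actual non-linear problem the step is applied and the linearization repeated until convergence.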
Keyframe selection is based on entropy ratios. The entropy of the parameter distribution is used to determine when a new keyframe is needed. Loop closure detection uses the same entropy ratio test. The method represents the map as a graph of camera poses, with edges representing relative transformations between keyframes.
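The entropy-ratio test can be sketched as follows. The entropy of a Gaussian pose estimate depends only on the determinant of its 6×6 covariance, so comparing entropies reduces to comparing covariances. The threshold value and the assumption of positive entropies are illustrative; the paper defines the exact convention:

```python
import numpy as np

def entropy(cov):
    """Differential entropy of a Gaussian with covariance `cov`:
    H = 0.5 * k * (1 + ln(2*pi)) + 0.5 * ln(det(cov)) for dimension k.
    For pose estimates k = 6 (twist coordinates)."""
    k = cov.shape[0]
    return 0.5 * k * (1.0 + np.log(2.0 * np.pi)) \
        + 0.5 * np.log(np.linalg.det(cov))

def needs_new_keyframe(cov_current, cov_first, threshold=0.9):
    """Entropy-ratio test (sketch): compare the entropy of the current
    frame-to-keyframe estimate against that of the first estimate after
    the keyframe. A ratio below the threshold indicates degraded
    tracking quality and triggers a new keyframe. The same ratio test
    validates candidate loop closures."""
    alpha = entropy(cov_current) / entropy(cov_first)
    return alpha < threshold
```

A larger covariance (a less certain estimate) yields a higher entropy, so the ratio tracks how much the registration quality has deteriorated since the keyframe was created.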
The method is evaluated on the RGB-D benchmark provided by the Technical University of Munich. It outperforms RGB-only and depth-only methods on most datasets, reducing global trajectory error substantially; the authors report an average improvement of 170% over pure visual odometry. It is efficient, with a processing time of around 32 ms per frame. The software is released as open-source.