Cross-Modal Semi-Dense 6-DoF Tracking of an Event Camera in Challenging Conditions


16 Jan 2024 | Yi-Fan Zuo¹, Wanting Xu², Xia Wang¹, Yifu Wang²,¹, and Laurent Kneip²,¹
This paper presents a computationally efficient, semi-dense cross-modal 6-DoF tracking approach for event cameras in challenging scenarios. The method uses semi-dense 3D maps generated either locally with a depth camera or globally with a visual SLAM or structure-from-motion framework. Two tracking methods are introduced: Canny-DEVO (Depth-Event Visual Odometry) and Canny-EVT (Event-based Visual Tracking). Both extract edge maps from event streams and perform 3D-2D edge alignment to estimate camera poses. The approach leverages signed time-surface maps (STSMs) to improve registration accuracy and introduces a novel culling strategy for occluded points. The method is validated on real datasets under challenging conditions and compared against traditional camera-based solutions. The results show that the proposed approach achieves highly accurate and efficient cross-modal tracking, outperforming alternative methods in dynamic and low-illumination scenarios. The paper also discusses related work, including RGB-based and event-based tracking methods, and highlights the advantages of using event cameras in combination with other sensors for robust localization. The framework is open-source and supports both event and regular camera inputs.
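To make the summarized pipeline concrete, below is a minimal Python sketch of two of its core ingredients: building a signed time-surface map from an event stream, and scoring a 3D-2D edge alignment by projecting semi-dense map points into that map. This is an illustrative assumption, not the paper's implementation: the function names, the exponential-decay STSM form, and the scoring objective are all hypothetical simplifications of what the authors describe.

```python
import numpy as np

def signed_time_surface(events, t_now, shape, tau=0.03):
    """Build a signed time-surface map (STSM) from events.

    Hedged sketch: assumes the common exponential-decay time-surface
    formulation with the event polarity folded in as a sign; the
    paper's exact STSM definition may differ in detail.

    events: iterable of (x, y, t, p) tuples with polarity p in {-1, +1}.
    """
    t_last = np.full(shape, -np.inf)   # timestamp of the latest event per pixel
    pol = np.zeros(shape)              # polarity of that latest event
    for x, y, t, p in events:
        t_last[y, x] = t
        pol[y, x] = p
    # Exponential decay: recently fired pixels dominate, sign encodes polarity.
    return pol * np.exp(-(t_now - t_last) / tau)

def edge_alignment_score(points_3d, T_wc, K, stsm):
    """Score a candidate pose by projecting semi-dense 3D map points into
    the STSM and rewarding hits on fresh event edges.

    A full tracker would optimize a 6-DoF pose perturbation against a
    smooth variant of this objective (and cull occluded points first);
    both steps are omitted here for brevity.
    """
    R, t = T_wc[:3, :3], T_wc[:3, 3]
    p_cam = (R @ points_3d.T).T + t                        # world -> camera frame
    z = p_cam[:, 2]
    valid = z > 1e-6                                       # keep points in front of the camera
    uv = (K @ (p_cam[valid] / z[valid, None]).T).T[:, :2]  # pinhole projection
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)
    h, w = stsm.shape
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return np.abs(stsm[v[inside], u[inside]]).sum()
```

As a usage sketch, one would rasterize the current event slice with `signed_time_surface`, then evaluate `edge_alignment_score` at perturbed poses to drive a nonlinear least-squares or gradient-based pose refinement.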